Statistics Notebook for Entities

The following statistics can be created with this notebook:

  • How many projects, samples and datasets were created per time unit (day, week, month, year).

    The statistics are output per

    • LOGS group

    • Person (or filtered by a specific person)

    • Instrument (or filtered by a specific instrument) (only for the statistics of datasets)

    Plots showing the respective number per year, month, calendar week and day are created and written to an HTML file. In addition, CSV files are created with the same information.

    For samples, the statistics on “prepared” and “discarded” are created for each person. For projects, statistics on “created on” and “modified on” are created for each person.

    For datasets, the acquisition date is used. For the statistics of projects per group, “created on” is used. For the statistics of projects per person, “created on” and “modified on” are used.

  • Which and how many experiments, projects and samples were created per instrument.

    One chart per instrument is created as an HTML file.
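
To illustrate what counting per time unit means, the following minimal sketch groups a handful of made-up creation dates by year, month, ISO calendar week and day using only the standard library. The function name `count_per_time_unit` and the sample dates are hypothetical and not part of LOGS_solutions; the notebook's own classes may aggregate differently.

```python
from collections import Counter
from datetime import datetime


def count_per_time_unit(dates):
    """Count dates per year, month, ISO calendar week and day."""
    counts = {"year": Counter(), "month": Counter(), "week": Counter(), "day": Counter()}
    for d in dates:
        iso_year, iso_week, _ = d.isocalendar()
        counts["year"][d.year] += 1
        counts["month"][(d.year, d.month)] += 1
        # ISO weeks belong to the ISO year, which can differ from d.year
        # around New Year.
        counts["week"][(iso_year, iso_week)] += 1
        counts["day"][d.date()] += 1
    return counts


# Made-up creation dates, for illustration only.
sample_dates = [
    datetime(2024, 1, 1, 9, 30),
    datetime(2024, 1, 3, 14, 0),
    datetime(2024, 2, 28, 8, 15),
]
result = count_per_time_unit(sample_dates)
print(result["month"])
```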

Important: The statistics are only generated for the period between 21.09.1677 and 11.04.2262. All other data is excluded from the statistics. If only the “discarded” or “modified on” time falls outside this period, the date is removed only from the statistics for “discarded”/“modified on”.
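
The stated limits coincide with the range that a signed 64-bit count of nanoseconds since 01.01.1970 can represent, which is also the range of pandas and NumPy nanosecond timestamps. The notebook does not state that this is the cause, so treat it as an assumption. The sketch below reproduces the bounds with the standard library and offers a hypothetical helper, `in_supported_range`, for checking a date before relying on it in the statistics.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
# Largest signed 64-bit integer, interpreted as nanoseconds since the epoch.
# datetime only has microsecond resolution, so the value is truncated,
# which keeps both bounds just inside the representable range.
MAX_OFFSET = timedelta(microseconds=(2**63 - 1) // 1000)

LOWER_BOUND = EPOCH - MAX_OFFSET
UPPER_BOUND = EPOCH + MAX_OFFSET


def in_supported_range(value):
    """Return True if the datetime lies within the representable period."""
    return LOWER_BOUND <= value <= UPPER_BOUND


print(LOWER_BOUND.date(), UPPER_BOUND.date())
print(in_supported_range(datetime(2024, 1, 1)))
print(in_supported_range(datetime(1600, 1, 1)))
```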

To run this notebook you need a Jupyter kernel. A kernel is a Python environment that executes the code from your notebook. If no Jupyter kernel is set up yet, you can register one with the following command in the terminal:

python -m ipykernel install --user --name env_name --display-name "YourName"

Imports

Please import all necessary modules first by executing the following section.

import sys, os
sys.path.append(os.path.abspath("..")) # for local imports
from datetime import datetime

from LOGS import LOGS
from LOGS_solutions.GenerateStatistics.StatisticEntities.StatisticsDatasets import StatisticsDatasets
from LOGS_solutions.GenerateStatistics.StatisticEntities.StatisticsProjects import StatisticsProjects
from LOGS_solutions.GenerateStatistics.StatisticEntities.StatisticsSamples import StatisticsSamples
from LOGS_solutions.GenerateStatistics.StatisticEntities.StatisticsInstruments import StatisticsInstruments

Parameters

Please set the parameters as you like.

The following parameters are for all scripts:

  • target_path: The target path, where all statistics should be saved. Default: Within the folder containing the script, a new folder “statistics” is created in which all statistics are saved.

  • begin_date: Earliest date for which statistics are created. Has to be a datetime object. Statistics of datasets and instruments are filtered based on the dataset acquisition date. Statistics of projects are filtered based on createdOn.
    Default: None (no limit)

  • end_date: Latest date for which statistics are created. Has to be a datetime object. Statistics of datasets and instruments are filtered based on the dataset acquisition date. Statistics of projects are filtered based on createdOn.
    Default: None (no limit)

Specific parameters for the statistics of datasets, projects and samples:

  • show_num_heatmap: Boolean that controls whether the numbers are shown in the heatmap.
    Default: True

  • persons: List of persons to include in the statistics. Has to be a list of person IDs. Use 0 for “No person”. If the list is empty, the statistics are created for all persons.
    Default: []

Specific parameters for the statistics of datasets and instruments:

  • instruments: List of instruments to include in the statistics. Has to be a list of instrument IDs. Use 0 for “No instrument”. If the list is empty, the statistics are created for all instruments.
    Default: []

Specific parameters for the statistics of instruments:

  • cutoff: Only statistics whose count is greater than or equal to the cutoff are displayed. The cutoff refers to the number of entities of the respective type and must be an integer.
    Default: 0
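
As an illustration of the rules described above (an empty list selects everything, 0 stands for “No person” or “No instrument”, and the cutoff keeps counts greater than or equal to its value), here is a minimal sketch. The helpers `select_ids` and `apply_cutoff` and all IDs and counts are hypothetical; they mirror the description, not the actual implementation in LOGS_solutions.

```python
def select_ids(available_ids, wanted_ids):
    """Return the IDs to report on; an empty wanted list means all IDs."""
    if not wanted_ids:
        return list(available_ids)
    wanted = set(wanted_ids)
    return [i for i in available_ids if i in wanted]


def apply_cutoff(counts, cutoff=0):
    """Keep only entries whose count is greater than or equal to the cutoff."""
    if not isinstance(cutoff, int):
        raise TypeError("cutoff must be an integer")
    return {name: n for name, n in counts.items() if n >= cutoff}


# Made-up IDs; 0 represents entities without an assigned person.
all_person_ids = [0, 12, 27, 31]
print(select_ids(all_person_ids, []))
print(select_ids(all_person_ids, [0, 27]))

# Made-up entity counts for one instrument.
counts = {"datasets": 40, "projects": 3, "samples": 0}
print(apply_cutoff(counts, cutoff=3))
```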

target_path = "./statistics"
begin_date = None # begin_date example: datetime(2024, 1, 1)
end_date = None # end_date example: datetime(2024, 2, 28)
show_num_heatmap = True
persons = []
instruments = []
cutoff = 0
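
Because the parameters have type requirements (datetime objects, lists of IDs, an integer cutoff), a quick check before constructing the statistics objects can catch mistakes early. The function `validate_parameters` below is a hypothetical helper, not part of LOGS_solutions. Two of its rules are assumptions: that IDs are integers, and that begin_date must not be later than end_date.

```python
from datetime import datetime


def validate_parameters(begin_date, end_date, persons, instruments, cutoff):
    """Raise an error if a parameter does not match its documented type."""
    for name, value in (("begin_date", begin_date), ("end_date", end_date)):
        if value is not None and not isinstance(value, datetime):
            raise TypeError(f"{name} must be a datetime object or None")
    # Assumption: a begin date after the end date is a user error.
    if begin_date is not None and end_date is not None and begin_date > end_date:
        raise ValueError("begin_date must not be later than end_date")
    # Assumption: person and instrument IDs are integers.
    for name, ids in (("persons", persons), ("instruments", instruments)):
        if not isinstance(ids, list) or not all(isinstance(i, int) for i in ids):
            raise TypeError(f"{name} must be a list of integer IDs")
    if not isinstance(cutoff, int):
        raise TypeError("cutoff must be an integer")


# Example values; the IDs are made up.
validate_parameters(
    begin_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 2, 28),
    persons=[0, 12],
    instruments=[],
    cutoff=0,
)
print("parameters look valid")
```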

Initialize class objects

Please make sure that the logs.json config file is in the same folder as the other classes and the notebook.

If the formatting of the config file is not clear, refer to the instructions for help: https://docs.logs-python.com/pages/setup.html
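
A missing or malformed config file is a likely reason for the initialization to fail. The sketch below only checks that a file named logs.json exists in a given folder and parses as JSON; it deliberately makes no assumption about which keys the file must contain (the linked instructions cover that). The helper `check_config` is hypothetical and not part of the LOGS library.

```python
import json
from pathlib import Path


def check_config(folder=".", filename="logs.json"):
    """Return the parsed config, or raise a descriptive error."""
    path = Path(folder) / filename
    if not path.is_file():
        raise FileNotFoundError(f"Config file not found: {path.resolve()}")
    try:
        with path.open(encoding="utf-8") as handle:
            return json.load(handle)
    except json.JSONDecodeError as error:
        raise ValueError(f"{path} is not valid JSON: {error}") from error
```

Calling `check_config()` from the notebook's folder before creating the LOGS object gives a clearer error message than a failure deeper inside the library would.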

The different classes do the following:

  • StatisticsDatasets: How many datasets were created per time unit for LOGS group, persons and instruments

  • StatisticsProjects: How many projects were created per time unit for LOGS group, persons

  • StatisticsSamples: How many samples were created per time unit for LOGS group, persons

  • StatisticsInstruments: Which and how many experiments, projects and samples were created per instrument

logs = LOGS()
statistics_dataset = StatisticsDatasets(logs, target_path, begin_date, end_date, show_num_heatmap, persons=persons, instruments=instruments)
statistics_projects = StatisticsProjects(logs, target_path, begin_date, end_date, show_num_heatmap, persons=persons)
statistics_samples = StatisticsSamples(logs, target_path, begin_date, end_date, show_num_heatmap, persons=persons)
statistics_instruments = StatisticsInstruments(logs, target_path, begin_date, end_date, instruments=instruments, cutoff=cutoff)

Create Statistics

If you would like to create all statistics, simply run the following section. If you only want to create one specific statistic, please comment out the others.

statistics_dataset.create_statistic()
statistics_projects.create_statistic()
statistics_samples.create_statistic()
statistics_instruments.create_statistic()
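
Instead of commenting lines in and out, the statistics to run can also be chosen by name. The following sketch assumes only what the cell above shows, namely that each object offers a `create_statistic()` method. The function `run_statistics`, the dictionary keys and the stand-in class `_DemoStatistic` are hypothetical. Continuing after a failed statistic is a design choice of this sketch, not the notebook's behavior.

```python
def run_statistics(jobs, only=None):
    """Run create_statistic() on the selected jobs and collect failures.

    jobs: dict mapping a name to an object with a create_statistic() method.
    only: optional list of names to run; None runs all jobs.
    """
    failed = {}
    for name, job in jobs.items():
        if only is not None and name not in only:
            continue
        try:
            job.create_statistic()
        except Exception as error:  # keep going so one failure does not stop the rest
            failed[name] = error
    return failed


class _DemoStatistic:
    """Stand-in for the statistics classes so the sketch runs on its own."""

    def __init__(self):
        self.created = False

    def create_statistic(self):
        self.created = True


# In the notebook, the values would be statistics_dataset, statistics_projects, etc.
jobs = {"datasets": _DemoStatistic(), "projects": _DemoStatistic()}
failed = run_statistics(jobs, only=["projects"])
print(failed)
```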