Parsl can store information about workflow execution into an `SQLite database <https://www.sqlite.org/>`_. Then you can look at the information, in a few different ways.
 
 .. index:: monitoring; configuration
+   MonitoringHub
 
-turning on monitoring
+Turning on monitoring
 =====================
 
-.. todo:: this section should show a simple configuration
+Here's the workflow used in `taskpath`, but with monitoring turned on:
+Compared to the earlier version, the changes are adding a ``monitoring=`` parameter to the Parsl configuration, and adding an additional app ``twice`` to make the workflow a bit more interesting.
+
+After running this, you should see a new file, ``runinfo/monitoring.db``:
+
+.. code-block::
+
+   $ ls runinfo/
+   000
+   monitoring.db
+
+This new file is an SQLite database shared between all workflow runs that use the same ``runinfo/`` directory.
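As a quick check that the file really is an ordinary SQLite database, here is a minimal sketch that lists its tables using Python's standard ``sqlite3`` module. The helper name ``list_tables`` is made up for illustration, and the demonstration runs against an in-memory stand-in database (with two placeholder tables) rather than a real ``runinfo/monitoring.db``.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in an SQLite database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Stand-in database so that this sketch runs without a real workflow run.
# The two tables and their columns here are placeholders, not the real schema.
# For a real run, connect to the file instead:
#     sqlite3.connect("runinfo/monitoring.db")
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE workflow (run_id TEXT)")
demo.execute("CREATE TABLE task (run_id TEXT, task_id INTEGER)")

tables = list_tables(demo)
print(tables)
```

Pointing the same helper at a real ``runinfo/monitoring.db`` should show whichever tables your Parsl version creates.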
+
+Using monitoring information
+============================
+
+There are two main approaches to looking at the monitoring database: the prototype ``parsl-visualize`` tool, and Python data analysis.
 
 .. index:: parsl-visualize
    monitoring; parsl-visualize
 
 parsl-visualize web UI
 ----------------------
 
-Parsl comes with a prototype visualizer for the monitoring database.
+Parsl comes with a prototype browser-based visualizer for the monitoring database.
+
+Start it like this, and then point your browser at the given URL.
+   WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+   * Running on http://127.0.0.1:8080
+   Press CTRL+C to quit
+
+
+
+Here's a screenshot, showing the above two-task workflow spending most of its 5 second run with the ``add`` task in ``launched`` state (waiting for a worker to be ready to run it), and the ``twice`` task in ``pending`` state (waiting for the ``add`` task to complete).
+
+.. image:: monitoring_wf.png
+   :width: 400
+   :alt: browser screenshot with some workflow statistics and two coloured bars for task progress
 
-Here's a screenshot:
+I'm not going to go further into ``parsl-visualize`` but you can run your own workflows and click around to explore.
 
-.. todo:: this should be a couple of screenshot and not much else
+.. index:: pandas
+   monitoring; pandas
+   library; pandas
 
-programmatic access
--------------------
+Using data frames
+-----------------
 
-I usually use SQL, but Parsl users are usually more familiar with data processing in Python: you can load the database tables into Pandas data frames and do data frame stuff there.
+A different approach preferred by many data-literate Parsl users is to treat monitoring data like any other Python data, using Pandas.
 
 .. todo:: one example of non-plot (count tasks?)
 
@@ -47,23 +102,23 @@ The monitoring database SQL schema is defined using SQLAlchemy's ORM model at:
-.. warning:: and the schema is defined again at https://github.com/Parsl/parsl/blob/3f2bf1865eea16cc44d6b7f8938a1ae1781c61fd/parsl/monitoring/visualization/models.py#L12 -- see issue https://github.com/Parsl/parsl/issues/2266
+.. warning:: The schema is defined a second time in `parsl/monitoring/visualization/models.py line 12 onwards <https://github.com/Parsl/parsl/blob/3f2bf1865eea16cc44d6b7f8938a1ae1781c61fd/parsl/monitoring/visualization/models.py#L12>`_. See `issue #2266 <https://github.com/Parsl/parsl/issues/2266>`_ for more discussion.
 
 These tables are defined:
 
 .. todo:: the core task-related tables can get a hierarchical diagram workflow/task/try+state/resource
 
-* workflow - each workflow run gets a row in this table. A workflow run is one call to ``parsl.load()`` with monitoring enabled, and everything that happens inside that initialized Parsl instance.
+* ``workflow`` - each workflow run gets a row in this table. A workflow run is one call to ``parsl.load()`` with monitoring enabled, and everything that happens inside that initialized Parsl instance.
 
-* task - each task (so each invocation of a decorated app) gets a row in this table
+* ``task`` - each task (so each invocation of a decorated app) gets a row in this table
 
-* try - if/when Parsl tries to execute a task, the try will get a row in this table. As mentioned in `elaborating`, there might not be any tries, or there might be many tries.
+* ``try`` - if/when Parsl tries to execute a task, the try will get a row in this table. As mentioned in `elaborating`, there might not be any tries, or there might be many tries.
 
-* status - this records the changes of task status, which include changes known on the submit side (in ``TaskRecord``) and changes which are not otherwise known to the submit side: when a task starts and ends running on a worker. You'll see ``running`` and ``running_ended`` states in this table which will never appear in the ``TaskRecord``. One ``task`` row may have many ``status`` rows.
+* ``status`` - this records the changes of task status, which include changes known on the submit side (in ``TaskRecord``) and changes which are not otherwise known to the submit side: when a task starts and ends running on a worker. You'll see ``running`` and ``running_ended`` states in this table which will never appear in the ``TaskRecord``. One ``task`` row may have many ``status`` rows.
 
-* resource - if Parsl resource monitoring is turned on (TODO: how?), a sub-mode of Parsl monitoring in general, then a resource monitor process will be placed alongside the task (see `elaborating`) which will report things like CPU time and memory usage periodically. Those reports will be stored in the resource table. So a try of a task may have many resource table rows.
+* ``resource`` - if Parsl resource monitoring is turned on (TODO: how?), a sub-mode of Parsl monitoring in general, then a resource monitor process will be placed alongside the task (see `elaborating`) which will report things like CPU time and memory usage periodically. Those reports will be stored in the resource table. So a try of a task may have many resource table rows.
 
-* block - when the scaling code starts or ends a block, or asks for status of a block, it stores any changes into this table. If enough monitoring is turned on, the block where a try runs will be stored in the relevant ``try`` table row.
+* ``block`` - when the scaling code starts or ends a block, or asks for status of a block, it stores any changes into this table. If enough monitoring is turned on, the block where a try runs will be stored in the relevant ``try`` table row.
 
-* node - this one is populated with information about connected worker pools with htex (and not at all with other executors), populated by the interchange when a pool registers or when it changes status (disconnects, is set to holding, etc)
+* ``node`` - this one is populated with information about connected worker pools with htex (and not at all with other executors), populated by the interchange when a pool registers or when it changes status (disconnects, is set to holding, etc)
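To make the "one ``task`` row may have many ``status`` rows" relationship concrete, here is a minimal sketch that joins the two tables and counts status changes per app. It runs against an in-memory stand-in database, not a real ``monitoring.db``. The column names (``run_id``, ``task_id``, ``task_func_name``, ``task_status_name``), the state names in the sample rows, and the helper ``status_counts`` are all assumptions made for illustration; check them against the SQLAlchemy model for your Parsl version before relying on them.

```python
import sqlite3

# Stand-in tables and rows; the column names are assumptions, not a copy
# of the real monitoring schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (run_id TEXT, task_id INTEGER, task_func_name TEXT);
    CREATE TABLE status (run_id TEXT, task_id INTEGER, task_status_name TEXT);
    INSERT INTO task VALUES ('r1', 0, 'add'), ('r1', 1, 'twice');
    INSERT INTO status VALUES
        ('r1', 0, 'pending'), ('r1', 0, 'launched'), ('r1', 0, 'running'),
        ('r1', 0, 'running_ended'), ('r1', 0, 'exec_done'),
        ('r1', 1, 'pending'), ('r1', 1, 'launched'), ('r1', 1, 'exec_done');
""")


def status_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Count status rows per app name by joining task and status."""
    query = """
        SELECT task.task_func_name, COUNT(*)
        FROM task JOIN status
          ON task.run_id = status.run_id AND task.task_id = status.task_id
        GROUP BY task.task_func_name
        ORDER BY task.task_func_name
    """
    return dict(conn.execute(query).fetchall())


counts = status_counts(conn)
print(counts)
```

The join uses both ``run_id`` and ``task_id`` because the database is shared between workflow runs, so a task number on its own would not necessarily identify a single task.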