The ROOT Streamer
Sometimes, mails like
'Dear experts,
my task worked fine <enter_date> but now
on grid it does not. can you take a look?'
are sent to the mailing list. Some of these unexpected errors are caused by misuse of the ROOT streamer. The streamer has been mentioned a couple of times in other sections. So what is the 'streamer' to begin with?
- The ROOT streamer decomposes an object into its data members (serialization) and writes it to disk
This sounds more confusing than it is. Take a look at our class example from section 1:
class Polygon
{
private:
int width, height;
};
The job of the streamer here would be to save the values of height and width, so that they can be written to disk.
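As a rough illustration of what "decomposing an object into its data members" means, the sketch below copies the two integers of a polygon into a byte buffer and reads them back. It is plain C++ and does not use ROOT: the names PolygonData, WritePolygon and ReadPolygon are hypothetical, and ROOT's real streamer uses its own buffer classes and file format.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for the Polygon class above (public members keep the sketch short).
struct PolygonData {
    int width;
    int height;
};

// "Stream out": copy each data member, one after the other, into a byte buffer.
std::vector<unsigned char> WritePolygon(const PolygonData& p) {
    std::vector<unsigned char> buffer(2 * sizeof(int));
    std::memcpy(buffer.data(), &p.width, sizeof(int));
    std::memcpy(buffer.data() + sizeof(int), &p.height, sizeof(int));
    return buffer;
}

// "Stream in": rebuild an object by reading the members back in the same order.
// Assumes the buffer was produced by WritePolygon (no size check in this sketch).
PolygonData ReadPolygon(const std::vector<unsigned char>& buffer) {
    PolygonData p{0, 0};
    std::memcpy(&p.width, buffer.data(), sizeof(int));
    std::memcpy(&p.height, buffer.data() + sizeof(int), sizeof(int));
    return p;
}
```

Writing and reading must agree on the order and size of the members; this is the bookkeeping that the generated streamer takes care of for us.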
The automatically generated streamer
ROOT automatically generates a streamer for us if we instruct it to do so. Remember from the earlier sections that we put the following lines in our task:
// tell ROOT the class version
ClassDef(AliAnalysisMyTask,1);
};
Invoking the ClassDef macro in this way instructs ROOT to construct a streamer for us.
Customization
ROOT generates a streamer for us, but we have the task of customizing it. Customization means defining which data members should and should not be touched by the streamer. Customization is done in the header (*.h) of a class, by giving specific instructions in the comments written after the declaration of data members (i.e. '//'). Let's take a look at this in practice.
Customizing: ‘member persistence’
Persistent Members (//) are ‘streamed’ (copied/stored)
class SomeClass : public TObject {
private:
Int_t fMinEta; // min eta value
Int_t fMaxEta; // max eta value
...
Transient Members (//!) are not streamed
class SomeClass : public TObject {
...
AliAODEvent* fAOD; //! current event
TH1F* fHistPt; //! our histogram
So if we were to write the object defined by the code snippets above to disk, the values of fMinEta and fMaxEta would be saved (//), but the values of fAOD and fHistPt would be ignored (//!).
Advanced use cases
We can customize the streamer in more sophisticated ways:
Pointer to Objects (//->) calls the streamer of the object directly
class SomeClass : public TObject {
private:
TClonesArray *fTracks; //->
...
Variable Length Array (//[fNvertex]) copies the array elements, so that not just the pointer is copied
class SomeClass : public TObject {
Int_t fNvertex;
...
Float_t *fClosestDistance; //[fNvertex]
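To see why the array length matters, here is a minimal sketch in plain C++ (no ROOT involved). A pointer on its own does not say how many elements it points to, so a writer needs the length member to copy the elements themselves. VertexData, StoredArray and WriteDistances are hypothetical names used only for this illustration.

```cpp
#include <vector>

// Stand-in for a class with a variable-length array member (not a ROOT class).
struct VertexData {
    int fNvertex;             // number of valid entries in the array
    float* fClosestDistance;  // array of length fNvertex, owned elsewhere
};

// What ends up "on disk" in this sketch: the length and a copy of the elements.
struct StoredArray {
    int length;
    std::vector<float> elements;
};

// Copy fNvertex elements, not the pointer value itself.
StoredArray WriteDistances(const VertexData& v) {
    StoredArray out;
    out.length = v.fNvertex;
    out.elements.assign(v.fClosestDistance, v.fClosestDistance + v.fNvertex);
    return out;
}
```

Because the elements are copied, the stored data stays valid even after the original array is modified or freed, which would not be the case if only the address were saved.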
All this is explained in detail in section 11.3 of the ROOT documentation: https://root.cern.ch/root/htmldoc/guides/users-guide/ROOTUsersGuide.html
Streamers and doxygen
In the second subsection of this tutorial, you saw that the streamer directives looked a bit different. This is because we sometimes use member comments in header files for yet another purpose: creating automatic code documentation with Doxygen. Doxygen is a way of documenting your code inside the source code itself, by adding special comments that are ignored by the C++ compiler.
Doxygen uses both //!< and ///< to introduce a data member comment. Up to and including ROOT 5, //!< could double as the marker for a transient data member. ROOT 6 broke this compatibility: //!< no longer means transient, so we now have to use //!<! to introduce a Doxygen comment that is interpreted as transient in both ROOT 5 and ROOT 6.
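The comment forms can be put side by side in a small header sketch. DocumentedTask is a hypothetical class, and the forward declarations stand in for the real ROOT and AliRoot headers so that the snippet compiles on its own. The sketch assumes, as described above, that //!<! is read as transient by both ROOT versions and that a ///< comment, which contains no '!' directly after the slashes, leaves the member persistent.

```cpp
// Forward declarations standing in for the real ROOT/AliRoot headers.
class TH1F;
class AliAODEvent;

/// Hypothetical class, used only to show the comment styles.
class DocumentedTask {
public:
    DocumentedTask() : fMinEta(0), fAOD(0), fHistPt(0) {}

    // Members are public only to keep the sketch short.
    int fMinEta;        ///< min eta value (Doxygen comment, member stays persistent)
    AliAODEvent* fAOD;  //!<! current event (Doxygen comment, transient member)
    TH1F* fHistPt;      //!<! our histogram (Doxygen comment, transient member)
};
```

For the C++ compiler all three are ordinary comments; only Doxygen and the ROOT dictionary generation give them a meaning.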
Pitfalls
We started this section by saying that the streamer definition can be a source of confusion. Why is this so?
Let's start by realizing when the streamer definition is actually relevant: this is when objects are written to disk or copied. If you run your analysis locally, your analysis task is (probably) never copied or stored to disk, so the streamer information is never used and it makes no difference whether you specify // or //!.
If you run on Grid, or on the LEGO train system, the situation is different, and the following happens:
- Your task is initialized locally and added to the analysis manager
- The manager is stored locally in a .root file
- This file is copied to grid nodes, invoking the streamer
- At the grid node, a fresh instance of your task is created
- Non-persistent members are initialized to their default values found in the empty I/O constructor (remember that we said in Section 3: you always need to specify an empty I/O constructor for your classes!)
- Persistent members are read from the .root file
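The steps above can be mimicked with a toy round trip in plain C++. Nothing here is ROOT API: ToyTask, ToyFile, WriteTask and ReadTask are hypothetical names, and a std::map plays the role of the .root file. The point is only the order of events on the Grid node: default construction first, then overwriting of the persistent members.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an analysis task; it does not use ROOT.
struct ToyTask {
    int fMinPt;    // persistent: would carry a plain // comment
    int fCounter;  // transient: would carry a //! comment

    // Empty "I/O constructor": gives every member a defined default value.
    ToyTask() : fMinPt(0), fCounter(0) {}
};

// Stand-in for the .root file that holds the analysis manager.
typedef std::map<std::string, int> ToyFile;

// Local side: only persistent members are written to the file.
ToyFile WriteTask(const ToyTask& task) {
    ToyFile file;
    file["fMinPt"] = task.fMinPt;  // fCounter is skipped on purpose
    return file;
}

// Grid node: create a fresh instance with the I/O constructor,
// then overwrite the persistent members with the stored values.
ToyTask ReadTask(const ToyFile& file) {
    ToyTask task;  // transient members keep their defaults
    ToyFile::const_iterator it = file.find("fMinPt");
    if (it != file.end()) {
        task.fMinPt = it->second;
    }
    return task;
}
```

If a task is configured locally with fMinPt = 5 and fCounter = 7, the instance returned by ReadTask(WriteTask(task)) has fMinPt equal to 5 but fCounter back at its default of 0.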
Example 1 - 'unexpected' behavior
Take a look at the class below:
class SomeClass : public TObject {
Int_t fAbsEta; //!
// setter
void SetAbsEta(Int_t eta) {fAbsEta = eta;}
...
// ROOT IO constructor
SomeClass::SomeClass() :
fAbsEta(0) {;}
In your steering macro, you call the setter on your SomeClass instance (here through a pointer that we call task)
task->SetAbsEta(3);
You now launch your analysis on the Grid. What is the value of fAbsEta when the analysis starts to run on a Grid node?
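Answer: 0. fAbsEta is transient (//!), so the value set in the steering macro is not written to the .root file. On the Grid node, the member keeps the default value from the I/O constructor.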
Example 2 - expected behavior?
Now take a look at a second, very similar class:
class SomeClass : public TObject {
Int_t fAbsEta; //
// setter
void SetAbsEta(Int_t eta) {fAbsEta = eta;}
...
// ROOT IO constructor
SomeClass::SomeClass() :
fAbsEta(0) {;}
Again, we call the setter on our SomeClass instance (here through a pointer that we call task)
task->SetAbsEta(3);
We then launch our analysis task on the Grid. What is the value of fAbsEta when the task starts to run on a Grid node?
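Answer: 3. fAbsEta is persistent (//), so the value set in the steering macro is stored in the .root file and read back on the Grid node.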
So you see that the streamer can have quite an important effect on your analysis!
Runtime
When you launch your analysis on the Grid, it's important to realize that some methods of your analysis task are called on your laptop when you execute your steering macro, whereas other methods are only called on the Grid nodes. The execution on the Grid is what we call runtime.
Looking at the functions of our analysis task, we can say, for example:
// constructor: called locally
AliAnalysisMyTask(const char*);
// function called once at RUNTIME
virtual void UserCreateOutputObjects();
// functions for each event at RUNTIME
virtual void UserExec(Option_t*);
It makes no sense to make data members that are initialized at runtime persistent. Therefore, as a rule-of-thumb, for all your output histograms, use the flag for non-persistence, i.e.
TH1F* fHistPt; //! pt histo
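A toy version of this life cycle, in plain C++ without ROOT, may help to see why storing such a member is pointless. ToyHist and ToyRuntimeTask are hypothetical stand-ins, the method names only mirror those of the analysis task above, and std::unique_ptr is used here purely to keep the sketch free of manual memory management.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a histogram; this is not the ROOT TH1F class.
struct ToyHist {
    std::vector<int> bins;
    explicit ToyHist(std::size_t nBins) : bins(nBins, 0) {}
};

struct ToyRuntimeTask {
    std::unique_ptr<ToyHist> fHistPt;  //! created at runtime, nothing useful to store

    // Called locally: the histogram does not exist yet.
    ToyRuntimeTask() : fHistPt() {}

    // Called once at runtime: the output object is created here.
    void UserCreateOutputObjects() {
        fHistPt.reset(new ToyHist(100));
    }

    // Called for each event at runtime: fill the histogram.
    void UserExec(int ptBin) {
        if (fHistPt && ptBin >= 0 &&
            static_cast<std::size_t>(ptBin) < fHistPt->bins.size()) {
            ++fHistPt->bins[ptBin];
        }
    }
};
```

At the moment the task would be written to the file on your laptop, fHistPt is still empty, so there is nothing meaningful to make persistent.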
Automatic Schema Evolution
As you develop your code, the layout of the persistent members of your class will probably change, e.g.
class Polygon
{
private:
int width;
};
which would take up 4 bytes, could change to
class Polygon
{
private:
int width, height;
};
which would take up 8 bytes when written to disk. The ClassDef value of your class keeps track of this evolution for ROOT and avoids compatibility problems such as
'The StreamerInfo of class AliAnalysisTaskMyTask
has the same version (=1) as the active class but a
different checksum. Do not try to write, ...,
the files will not be readable.'
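The role of the version number can be illustrated with a toy reader in plain C++. This is not how ROOT implements schema evolution (ROOT relies on its StreamerInfo records and checksums); StoredPolygon, PolygonV2 and ReadStoredPolygon are hypothetical names. The sketch only shows that a reader can interpret an old layout when the stored version tells it which layout to expect, and that it has to give up when version and layout disagree.

```cpp
#include <vector>

// Current in-memory class: 'height' was added after version 1.
struct PolygonV2 {
    int width;
    int height;
    PolygonV2() : width(0), height(0) {}
};

// Stand-in for stored data: a class version followed by the member values.
struct StoredPolygon {
    int version;              // plays the role of the ClassDef value
    std::vector<int> values;  // version 1: {width}; version 2: {width, height}
};

// Read data written by either version into the current class layout.
// Returns false if version and layout do not match what this reader knows.
bool ReadStoredPolygon(const StoredPolygon& stored, PolygonV2& out) {
    out = PolygonV2();
    if (stored.version == 1 && stored.values.size() == 1) {
        out.width = stored.values[0];  // height did not exist yet: keep default
        return true;
    }
    if (stored.version == 2 && stored.values.size() == 2) {
        out.width = stored.values[0];
        out.height = stored.values[1];
        return true;
    }
    return false;
}
```

Data that claims version 1 but contains two values is rejected, which is the toy analogue of the "same version but a different checksum" message above.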
If you commit your code to AliPhysics (or AliRoot), you should make sure to increment the ClassDef value whenever you change the list of persistent members. For example, if we have a class
class SomeClass : public TObject {
private:
Int_t fAbsEta; // min eta value
...
ClassDef(SomeClass, 1);
and we add a persistent member, we have to increment the ClassDef value:
class SomeClass : public TObject {
private:
Int_t fMinEta; // min eta value
Int_t fMaxEta; // max eta value
...
ClassDef(SomeClass, 2);
If the list of non-persistent members changes, e.g. when we change
class SomeClass : public TObject {
private:
Int_t fAbsEta; //! min eta value
...
ClassDef(SomeClass, 1);
into
class SomeClass : public TObject {
private:
Int_t fMinEta; //! min eta value
Int_t fMaxEta; //! max eta value
...
ClassDef(SomeClass, 1);
we do not have to increment the ClassDef value, because the streamer definition of the class does not change.