vdas_20110713
by Marie Anne Bizouard — last modified 2011-07-13 21:35
- Announcements
- Current issues:
- Data transfer (Livio)
- Last call action items:
- ffl files production at CNAF and Lyon
- Db requirements file changes
- Pilot jobs: which pipelines are going to use PJ at CNAF in the next months?
- VDB latency problem
- Specific topics:
Minutes
Attendance: Stefano, Livio, Antonella, Leone, Loic, MAB (minutes taker), Didier by email.
- Data transfer (Livio): the VA3 data transfer from Lyon to Bologna is done; the raw files transfer from Bologna to Lyon is now under way. There is a problem generating the .ffl files because files can be on tape at CNAF. The FrDump tools are no longer adapted to future needs: the metadata needed to build .ffl files should be stored in a database. There will always be a way to check the integrity of a small set of files, but building a full .ffl covering several months of data is no longer practical. Once the transfer is complete, the .ffl files will be built and checked (see the sketch below).
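As a purely illustrative sketch of the database-driven .ffl production idea (the input format and the exact .ffl field layout here are assumptions, not the agreed design): once the per-file metadata (path, GPS start, duration) sits in a database, an .ffl can be regenerated from a metadata dump without re-reading any frame file.

    /* ffl_from_metadata.c - hedged sketch: rebuild an .ffl from a
     * metadata dump, one "path gps_start duration" triple per line.
     * The input format and the trailing fields are assumptions. */
    #include <stdio.h>

    int main(void)
    {
        char path[1024];
        double gps, dt;

        /* e.g. feed it the output of a database query */
        while (scanf("%1023s %lf %lf", path, &gps, &dt) == 3)
            printf("%s %.5f %.1f 0 0\n", path, gps, dt);

        return 0;
    }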
- Last meeting action items (MAB): nothing done. The Db requirements are to be written by MAB in the next days!
- VDB latency problem:
- Summary of the situation (MAB): from time to time the SegOnline latency to write into VDB is too long and segments get corrupted. The ASCII/XML segment files are also affected by this problem. Didier observed that every hour the VDB writing latency can reach 150 s, but this does not generate segment corruption. The segment corruption incidents seem to be related to other users' activities (see Stefano's emails and report below).
- Report from Stefano about the July 1st incident:
"about the VDB problem on 1 July (drop in data writing to VDB as in the attached graph, from 9.53 to 18.20) it corresponds to an overload of mysql connections to the VDB server (pub7). From the firewall the rate of mysql connections from olserver16 to pub7 ranges from 125 conn/s to 13000 conn/s at those times. Even now there are about 270 tcp connections in TIME_WAIT state between these 2 machines. As an additional info, maybe unrelated, in the same time range there was a transfer of data from to the directories:
/data/mdc/ninja2/GAUSSIAN_2months/V1/V-V1_NINJA2_GAUSSIAN-xxxx owned by the accadia user". scp transfer between ligo and the storage farm volume using mbta1 (olnode04). olnode04 is on network interface .72. "This host of the computing farm has an interface on the ITF (.72) internal network and, if used for the transfer to a storage farm node which is on the general network (.73), causes an unnecessary duplication of the traffic going back and forth through the layer3 switch connecting the 2 networks (1 Gbps single link). It seems that the mysql communication between SegOnline on olserver16 and VDB on pub7 (public network) is particularly sensible to this overload. I suspect this is related to the number of tcp/mysql connections per second that are opened and shortly closed leaving constantly more than 200 tcp connections in TIME_WAIT state." - Didier's observations: each hour there is a ~150s peak latency in VDB latency. What's the origin?
- Stefano/Leone: this hourly latency peak should correspond to a cronjob that inspects the VDB connections. A priori that should not generate any trouble. A test could be made to check whether the 150 s latency peak disappears when the cronjob is off.
- Network topology (from Stefano's emails): "To be clear, the topology is:
ITF LAN (.72) <- layer3 switch -> General Computing LAN (.73) <- firewall -> Internet
      |                                     |                                   |
  olservers                            storage farm                        public net
                                                                                |
                                                                           VDB (pub7)
and the traffic on the cross-link can be seen in the attached graph for 1st July, when there was a major latency drop.
I add some notes here because it is difficult to do this in an audio call.
The traffic is routed correctly (by the way, the raw data traffic does not pass through the layer3 switch); the problem is that the coupling between the 2 networks depends on the choices of the users, as in this case, when a user decided to use a node on the ITF LAN to do a bulk transfer from the Internet via scp to a volume on the storage farm via NFS.
Duplication means that the traffic passes twice through the layer3 switch, whereas it would not pass through it at all if a host on the general computing LAN were chosen as the transfer endpoint.
These situations occur often and are facilitated by 2 facts: the ITF LAN is open to everybody, and users do not contact us or give notice of potentially heavy activities that are out of the ordinary (a bulk transfer, a reprocessing, etc.).
Of course the upgrade of the switch is planned for the Advanced Virgo stop, but in any case I think that if we are speaking of VDB as an on-line activity, the placement constraints should probably be reviewed. In the meantime I think that a rework to use a single permanent tcp connection to stream all the mysql queries from SegOnline could also be useful" (see the sketch after the discussion below).
- Discussion: there might be something not understood at the level of the layer3 switch (large bandwidth toward the public network: > 200 Mb/s during the July 1st incident). This switch is 10 years old; its replacement was planned before VSR4 but has been postponed. Replacing it could partially solve the problem. Another solution would be to move the VDB server to the online network (.72); the client service/web server would remain on the public network. A priori that should be possible and would protect the SegOnline --> VDB connection.
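To make Stefano's "single permanent tcp connection" suggestion concrete, here is a minimal sketch using the standard MySQL C API; the host, user, password and database names are placeholders, and whether this maps cleanly onto the olVDB layer is an open question:

    /* Hedged sketch: one persistent MySQL connection reused for all
     * queries, instead of connect/close per query. The connection
     * parameters are placeholders, not the real VDB settings. */
    #include <mysql/mysql.h>
    #include <stdio.h>

    static MYSQL vdb;               /* single long-lived handle */

    int vdb_open(void)
    {
        my_bool reconnect = 1;
        mysql_init(&vdb);
        /* let mysql_ping() revive the link if the server drops it */
        mysql_options(&vdb, MYSQL_OPT_RECONNECT, &reconnect);
        if (!mysql_real_connect(&vdb, "pub7.example", "segonline",
                                "secret", "vdb", 0, NULL, 0)) {
            fprintf(stderr, "connect: %s\n", mysql_error(&vdb));
            return -1;
        }
        return 0;
    }

    int vdb_put(const char *sql)
    {
        /* reuse the same handle for every query */
        if (mysql_ping(&vdb) != 0 || mysql_query(&vdb, sql) != 0) {
            fprintf(stderr, "query: %s\n", mysql_error(&vdb));
            return -1;
        }
        return 0;
    }

    void vdb_close(void)
    {
        mysql_close(&vdb);          /* only at process shutdown */
    }

The point is only that one long-lived handle replaces the per-query tcp open/close cycle that leaves hundreds of sockets in TIME_WAIT.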
- Leone: wonders how many connections/queries per second SegOnline opens.
- Didier's info (by email): "Attached to this email, there is an updated plot of the latencies where we see clearly that the latencies in writing the ASCII files are decorrelated from the latencies in writing into VDB. We still see the 1 hour periodicity in the VDB latency peaks. Has anyone a clue about this?" --> MAB: see the cronjob answer above.
"Finally, one word about SegOnline: the writing into VDB is done only in the function SegFeedVDB, using one VDB command:
olVDB_segments_put_nsampleDasync(&vdbh, segch[i]->dqname, OLVERSION, segch[i]->nsample, segch[i]->sampleGPS, segch[i]->sampleValue);
where sampleGPS and sampleValue are arrays containing the GPS times and values of the last segments produced for a given flag."
- Leone: how many DQ lists?
- Didier's email: "SegOnline deals with about 100 DQ flags. For each of them, the ASCII files are updated (fully rewritten) every 30 s, and the call to olVDB_segments_put_nsampleDasync is done each time 30 new values of the flag have been received (which also means every 30 s, since the DAQ sends one frame per second and there is one value of a given DQ flag in each frame). So the usual rhythm is 100 calls to olVDB_segments_put_nsampleDasync every 30 s, each time with the arrays "sampleGPS" and "sampleValue" containing 30 values. The main while loop of SegOnline just waits for an input frame and calls SegFeedVDB only when a new input frame has arrived, usually once per second. Then, in SegFeedVDB, there is a condition which prevents olVDB_segments_put_nsampleDasync from being called once per second." (A sketch of this batching condition follows.)
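For scale, 100 calls every 30 s is only ~3.3 calls/s, well below the 125-13000 conn/s seen at the firewall during the incident, which suggests SegOnline's nominal call rate alone does not explain the overload. Below is a hedged sketch of the batching condition Didier describes; BATCH, the DQFlag structure and the helper name are illustrative, and only olVDB_segments_put_nsampleDasync, SegFeedVDB, sampleGPS and sampleValue come from the minutes:

    #include <stdio.h>

    #define BATCH 30   /* values accumulated before each VDB push */

    typedef struct {
        const char *dqname;
        int         count;               /* values buffered so far */
        double      sampleGPS[BATCH];
        double      sampleValue[BATCH];
    } DQFlag;

    /* Called once per input frame, i.e. once per second. */
    static void SegFeedVDB_sketch(DQFlag *f, double gps, double value)
    {
        f->sampleGPS[f->count]   = gps;
        f->sampleValue[f->count] = value;
        f->count++;

        if (f->count == BATCH) {         /* the 30-new-values condition */
            /* here SegOnline would call olVDB_segments_put_nsampleDasync(
             * &vdbh, f->dqname, OLVERSION, f->count, f->sampleGPS,
             * f->sampleValue); */
            printf("%s: push %d values at GPS %.0f\n",
                   f->dqname, f->count, gps);
            f->count = 0;                /* start the next 30 s batch */
        }
    }

    int main(void)
    {
        DQFlag flag = { "DQ_FLAG_EXAMPLE", 0, {0}, {0} };
        for (int t = 0; t < 90; t++)     /* 90 fake 1 Hz frames */
            SegFeedVDB_sketch(&flag, 1000000000.0 + t, 1.0);
        return 0;
    }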
- Leone: is it possible to limit the bandwidth of scp or ftp with the -l option? (For scp, -l limits the bandwidth in Kbit/s.)
- Leone: we should also worry about the I/O traffic generated when operation (COMBINATION) queries are done; these might require accessing lots of segments when users ask for long stretches of data. A priori the load for VDB is limited to I/O; the computation itself should not perturb the VDB server.
- Leone would like to have another look at the SegOnline code to be sure that the VDB API is used properly (Didier's email above should answer part of this).
Conclusion of the discussion:
Whatever the control of traffic from the online to the public network, if VDB remains on the public part of the network there will always be a risk of overload. One solution is to move the VDB server to the online network partition while keeping the client services/web server on pub7. Question: should we do that before the end of the run, or is it too risky?
Action item: Stefano will check with Giuseppe whether such an intervention can be done during a maintenance period in the next weeks.
Leone mentions that it could mean an interruption of the client services (web server and user queries with VDBtk) for 1 or 2 days; users should be warned.
Didier must be available to make sure that the SegOnline service is back online and working fine.