vdas_20130822

by Marie Anne Bizouard last modified 2013-09-10 22:28

VDAS

1. Announcements

2. procdata
- summer problems [Stefano, Giuseppe, Livio]
- backup strategy [see image from Livio]
- status (quotas)

3. AOB

Attendance: Antonella, Stefano, Livio, Giuseppe, Loic, Didier, Florent, MAB

Minutes:
1. procdata summer problems: 2 problems
- Stefano: main file server issues (NFS hanging). A storage array port failure left the file server in a weird state, and some servers had to be stopped. Maintenance with the companies will wait until after the end of ER4.

- Livio: problems on st7 (procdata) happened several times; the server was rebooting. Suspects a software-layer problem related to the disk redundancy check (still a beta version in the Linux kernel). Tests were performed during the summer, but they could not be extensive; whatever is not needed was simply disabled. Drivers are suspected as the cause. The machine is now monitored. No more problems for ~3 weeks.

MAB asked Stefano/Giuseppe/Livio to send emails as quickly as possible during ERs when problems on a disk/file server are detected, making it clear to everyone that online servers accessing the disks can be affected. Then, during ERs, it is the responsibility of process owners to check the status/behavior of their processes.
Stefano also proposes to use the logbook to spread the information.

2. Backup strategy
Livio: Tape library backup: the software can select which folders/files are backed up. 2 ways: snapshot or keep history. Backup frequency.
--> identify files that REALLY need backup, and in which mode (crash recovery or historical archive).

Didier: procdata has been very reliable so far. Maybe no need to back it up. Will the new procdata storage system be as reliable as before? Would the full 62 TB be affected?
Livio: the full procdata volume is hosted by 1 server only (there are no partitions any more). A crash of the server could affect any file in the whole volume.

Antonella: need to know which backup method users want (crash recovery or historical recovery).

3. Status
- detchar: need Elena's input.
- quota: missing numbers. Need to update the table. Quotas should be understood as a way to keep runaway activity from preventing others from working properly. So values should be high enough that users are not always at the limit.

Didier: all folders at 30 TB?
MAB: then quotas are not useful. There are folders that we know will remain small and should remain small.

Action item: MAB will send an email to complete the information collection, and will also ask for backup needs, providing all the relevant information.

AOB: next meeting in 2 weeks