FAQs - Frequently Asked Questions
Q: Is a financial contribution required from users for the use of METACentrum capacities?
A: The use of METACentrum capacities is free of charge. The only requirement is that you are a member of the academic community of the Czech Republic (more information can be found in the section "Getting an account").
Q: How do I obtain a METACentrum account and, subsequently, access to METACentrum machines?
A: A METACentrum account can be obtained by filling in the application form in the section
"Getting an account" -> "Registration form". A PDF file is generated once the form has been
filled in. This PDF file has to be printed, signed and sent to a contact person listed in the section
"Getting an account" -> "Contact persons" (chosen according to your home AFS cell).
Q: How do I obtain access to further machines if I already have a METACentrum account?
A: If you already have a working METACentrum account, it is sufficient to request the additional accounts in the section "My account" -> "Additional accounts". For a machine with a special access policy, you may be required to print and sign a special form.
A: The form can be displayed again at any time in the section "Getting an account" -> "Registration state checking".
Q: What does "Accounts activation" mean?
A: If you know in advance that, for whatever reason (a stay abroad, a change to a different architecture, etc.), you will not use some of your accounts on specific METACentrum machines for a long time, you can deactivate those accounts temporarily (e.g. to prevent their abuse). "Accounts activation" is the reverse process, i.e. the activation of deactivated accounts. Activation and deactivation are unrelated to the possible expiration of an account due to non-renewal of the METACentrum account.
Q: Why is it necessary to renew the account every calendar year?
A: Because by renewing the account, users demonstrate their activity on the available METACentrum resources. Moreover, the user has to go through the web form and confirm that: "I confirm that my personal information in METACentrum database are actual, that I am informed about actual version of user proclamation and I do not know about any fact that would hinder me from accessing of technical resources of METACentrum." With this step we try to partially eliminate inactive users and users who do not use METACentrum resources in accordance with their declared purpose. The METACentrum staff reserve the right to use the submitted information when preparing the current METACentrum yearbook.
A: If you remember your password, request the account renewal in the standard way. If you have forgotten your password, you have to visit any contact place in person, prove your identity and ask for a new password to be set. You can then request the account renewal.
Q: How does account renewal work?
A: All METACentrum users are asked to renew their accounts at the end of each calendar year. This is done in the section "Annual report and extension of accounts" by selecting the period for which the account should be renewed and submitting a report that describes the purposes for which the user used METACentrum resources during the given calendar year.
Q: Do I have to submit an annual report?
A: No, you are not obliged to. However, if you do not submit the annual report, you will lose the possibility of using the computational capacities of METACentrum.
Q: What will happen if I do not submit an annual report?
A: If you do not submit an annual report, all your active accounts will be blocked. Such accounts can be unblocked later, but only if you submit an annual report for the last year in which you were active in METACentrum.
Q: Do I have an account in METACentrum forever?
A: The existence of your METACentrum account is limited only by your affiliation with the academic community. You only have to renew your METACentrum account on time (before the end of the calendar year) and submit an annual report for the past calendar year. If you have ever applied for and obtained a METACentrum account, it is probable that the account can be renewed even after a long period of inactivity.
Q: Why can I not have an account on the machines loslab, wood and quark?
A: The cluster loslab was acquired from the resources of the research centre Loschmidt Laboratories, so it is accessible only to researchers from this centre. The situation is similar for the cluster wood, which is dedicated to researchers of the Department of Wood Science of MZLU and the Faculty of Electrical Engineering and Communication of VUT. The cluster quark is dedicated to video processing at MU. For these reasons, accounts on these machines cannot be created for ordinary METACentrum users.
Q: What should I do if I forget or lose my password to METACentrum resources?
A: Please visit the page My account - Change password and follow the instructions.
A: No. A password cannot be changed on the basis of an e-mail request, because METACentrum administrators have no guarantee that the e-mail comes from the same person who properly submitted and signed the registration form, fulfils the requirements for being granted an account, is authorized to use our resources and has pledged to respect our security and operational rules. Moreover, the password could be intercepted in transit.
You can enter a new password as part of a Password change request, which can be made from the page My account - Change password. We will try to verify the request by calling the phone number specified in the user's original details, or a number obtained from an independent source such as the official web pages of the user's institution.
If the request cannot be verified, you have to visit the nearest METACentrum contact place, prove your identity and set a new password there. You may consider this an unnecessary inconvenience, but please bear in mind that METACentrum gives you access to technical equipment worth tens of millions of CZK, and therefore the access has to be secured properly.
Q: What is the difference between the AFS password and the passwords to individual machines?
A: There is no difference. We use the Kerberos system for authentication to all METACentrum resources, i.e. AFS, the METACentrum portal and the accounts on individual machines.
Q: How do I change the password for access to METACentrum resources?
A: Use the command kpasswd on any METACentrum machine.
Q: What exactly does authentication by a principal @META mean, and how does it work?
A: A principal is a user identity in the Kerberos system; META is the realm (name space) of the Kerberos system used in METACentrum. When you enter your password during login or with the command kauth, a ticket with limited validity (10 hours by default) is created. This ticket provides access to further resources without the need to enter the password repeatedly.
Q: Why is there only the queue preemptible on the orca cluster?
A: The cluster orca was bought using NCBR resources and was included in METACentrum under a special regime: the owners can submit any jobs through the queue orca, while the rest of METACentrum can submit only to the queue preemptible. The queue preemptible is similar to the queue long, but its jobs can be preempted.
Q: Why is it not possible to obtain an account on the cluster loslab?
A: Cluster perian was bought from the Loschmidt Laboratories project and it is accessible strictly only to NCBR workers. If anybody outside the corresponding group would like to use it, an authorization from the resource owner (administrator) of this cluster is required.
Q: Why is there a property long for skirit when long is also a type of queue? Why are the queues short/normal/long not available on all machines?
A: The queues long/normal can access all normal nodes (i.e. they CANNOT access special-purpose nodes, nodes dedicated to projects, nodes bought from the resources of a specific organization, etc., e.g. the nodes perian and quark). The queue long can access only nodes with the property long: part of the cluster hydra, part of the cluster skirit and all nodes of the cluster nympha.
Q: Why is it not possible to connect to the server skirit.ics.muni.cz using SSH or WinSCP?
A: If you use the PuTTY client, your configuration is not compatible with the settings of our SSH servers. Please change your client settings as follows: set the preferred SSH protocol version to 2 (instead of 1), see the item 'Preferred SSH protocol version' on the Connection->SSH tab, and also activate the item 'Attempt "keyboard-interactive" auth (SSH-2)' on the Connection->SSH->Auth tab.
Q: I have not found the SCRATCH and STORAGE directories on the machine skurut. How can I use this machine?
A: You are not allowed to compute on the machines skurut, skirit, nympha and so on; these machines are the frontends of the clusters, which is why they have no /scratch directories. Machines with numbers in their names, such as skurut34 or skirit48-1, are used for computing. The nodes of the cluster skurut (machines skurut33, skurut34, skurut35, etc.) have /scratch directories but no /storage directories, because these machines do not have the feature "nfsv4" in PBS. To use these machines, work in /scratch.
A: Yes, it is. Your results can be copied into /home automatically by a command in the task script. If your PC runs server software permanently (e.g. an SFTP server, which can also run on Windows) and has a very good connection, your results could be sent directly there, but I do not recommend this for large files. The METACentrum rules say that data should be kept in /scratch only while the task is running. It is therefore not acceptable for a task to leave its data in /scratch after it finishes so that you can copy the data manually later. Bear in mind that once your task ends, a new task will start on the same machine, and it may need the space in /scratch.
Q: Are the /storage and /scratch directories cleaned automatically? If so, how often and according to which rules?
A: There is no automatic cleaning of /scratch and /storage. Data in /scratch must be removed by the task itself. If the task does not do so, we delete the user's data.
Q: Is there a problem with copying output to /home if the computation runs on the machine SKURUT33 but was started from /home on the machine SKIRIT?
A: Yes, there is, because /home on SKURUT is in Prague while /home on SKIRIT is in Brno. These are completely different directories.
A: The described problem arises because your script for the PBS system was created in a different operating system (DOS/Windows), which handles line endings in text files differently from Unix/Linux. You can convert files from the DOS/Windows format before submission using the command "dos2unix" ("fromdos").
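As a rough sketch, a script can be checked for DOS/Windows line endings before it is submitted. The function name has_crlf and the file name job.sh are illustrative only and are not part of METACentrum; the check simply looks for a carriage-return character in the file.

```shell
# Hypothetical helper: succeeds (exit status 0) if the file given as $1
# contains at least one carriage return, i.e. DOS/Windows line endings.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Possible use before submission (job.sh is an assumed file name):
#   if has_crlf job.sh; then dos2unix job.sh; fi
```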
A: Input files and directories are copied using stagein at the moment when the job starts executing on a specific node (a standard data transfer using scp takes place before and after job execution). That is, if you change the input while the job is queued, the job should be computed with the new input (provided, of course, that the input files keep the same names).
A: One way is to use the command qsub -l nodes=MACHINE_NAME:ppn=NUMBER_OF_PROCESSORS. If you do not need a particular machine but any machine with a 64-bit system (architecture AMD64 or EM64T), it is better to request the property "amd64" in the qsub command instead.
A: A valid Kerberos ticket is required for the successful submission of a computational job to the PBS batch system. A user normally obtains a Kerberos ticket automatically after a successful login to a METACentrum machine. A ticket is valid for 10 hours. If the Kerberos ticket has expired and the user tries to submit a job, the error message in question appears. This can be solved by renewing the ticket with the command "kauth" or by logging in again. The current state of your Kerberos tickets can be checked using the command "klist". More information can be found in the Kerberos documentation in the section Documentation -> Environment -> Kerberos.
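A minimal sketch of checking the ticket before submission follows. The wrapper name submit_with_ticket is hypothetical, and the sketch assumes that the installed klist supports the -s option (as the MIT Kerberos klist does), which makes klist print nothing and report through its exit status whether a valid ticket exists.

```shell
# Hypothetical wrapper: submit a job only if a valid Kerberos ticket exists.
# Assumes "klist -s" returns 0 for a valid ticket and non-zero otherwise.
submit_with_ticket() {
    if ! klist -s; then
        echo "No valid Kerberos ticket: run kauth or log in again." >&2
        return 1
    fi
    # All arguments are passed to qsub unchanged.
    qsub "$@"
}
```

It would be called in place of qsub, for example submit_with_ticket job.sh, where job.sh is an assumed script name.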
A: Add the parameter "-r n" to the qsub command.
A: Add the parameter "-a" to the qsub command (-a date_time declares the time after which the job is eligible for execution).
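For illustration only: PBS implementations of qsub commonly accept date_time in the form [[[[CC]YY]MM]DD]hhmm[.SS], but check "man qsub" on the machine you use. The helper pbs_time and the script name job.sh below are hypothetical.

```shell
# Build a CCYYMMDDhhmm timestamp from year, month, day, hour and minute.
# Pass plain numbers without leading zeros (8, not 08).
pbs_time() {
    printf '%04d%02d%02d%02d%02d\n' "$1" "$2" "$3" "$4" "$5"
}

# Possible use: make the job eligible from 22:30 on 24 December 2025.
#   qsub -a "$(pbs_time 2025 12 24 22 30)" job.sh
```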
A: At the beginning of the task script that you submit to PBS, it is a good idea to change the current directory according to your needs. Otherwise your job runs in the default directory on the cluster, not in the directory that was current when qsub was run. (To refer to that directory from the task script, you can use the variable PBS_O_WORKDIR, provided the job runs on the same cluster as qsub.) Input and output files can be copied automatically to and from the computational node with the qsub parameters "-W stagein=..." and "-W stageout=...". It is also good to use the AFS directory (~/shared), which is slower (and in particular not well suited to temporary and huge files) but visible from all METACentrum nodes.
A: This refers to your subdirectory in /scratch (i.e. /scratch/your_user_name). (This directory exists only on the computational nodes, not on the cluster frontend (skirit.ics.muni.cz etc.).) A task script could look like this, for example:
mkdir /scratch/your_user_name/$PBS_JOBID
cd /scratch/your_user_name/$PBS_JOBID
cp ~/shared/.../input* .
# or: scp submitting_node:/home/your_user_name/.../input* .
# computation
cp output* ~/shared/.../
# or: scp output* submitting_node:/home/your_user_name/.../
# cleaning:
cd /scratch/your_user_name
rm -rf $PBS_JOBID
A: PBS stores a copy of the task script and of the variable settings at submission time. The script is executed as it was when it was submitted; everything else (called programs, files for stagein, etc.) is used in its current state. When changing binaries or other files with a complex structure, take care not to affect a task that is currently running: it is better to rename the file, so that the running task can keep using the older version without danger.
cc -o program.version2 ....
mv program program.version1 && mv program.version2 program
or through symbolic links that point to a particular version:
cc -o program.version1 ...
ln -s program.version1 program
# running
cc -o program.version2 ...
rm program && ln -s program.version2 program
Q: I want to be informed by e-mail when a job terminates. What should I do?
A: Add the parameter "-m ae" to the qsub command. All details of the qsub command can be found in its manual page ("man qsub").
A: You can see your jobs on the portal, including the machines and processors used by each job. You can then log in to those machines through ssh (PuTTY) and check your jobs as needed.
A: If the computation is interrupted because a limit was exceeded, you can handle this by making the task respond to the SIGTERM signal and remove its files in that case.
Signal handling is defined in a shell script with the command trap, e.g.
trap "rm -rf /scratch/$USER/$PBS_JOBID" SIGTERM
This also works when the computation is interrupted by the command "qdel".
Cases in which the job crashes because of a serious system failure (e.g. a power outage) cannot be handled automatically.
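The trap approach can be wrapped in a small function, sketched here under stated assumptions: the name cleanup_on_term is hypothetical, and exit status 143 (128 + 15) is only the conventional value for a process stopped by SIGTERM, not something METACentrum prescribes.

```shell
# Register a SIGTERM handler that removes the given directory and then
# stops the script. $1 is the job's scratch directory.
cleanup_on_term() {
    job_dir=$1
    # Single quotes: $job_dir is expanded when the signal arrives.
    trap 'rm -rf "$job_dir"; exit 143' TERM
}

# Possible use at the top of a task script:
#   cleanup_on_term "/scratch/$USER/$PBS_JOBID"
```

Stopping the script inside the handler keeps the computation from continuing in a scratch directory that has just been removed.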