I got back to the office this morning, looked at the Host List from within qubic, and
saw that a couple of our test nodes seemed to have gone down. I tried to restart the qubeworker service but
got the following error message:
---------------------------------------------------
Could not start the qubeworker on Local Computer
Error 1: Incorrect function
---------------------------------------------------
Upon further investigation into the worker log on "RB-01" I could see the following:
Exec Database Error: no such table: assignment(1)
SELECT job_id, subjob_id FROM assignment
Exec Database Error: no such table: variables(1)
SELECT name, value FROM variables
Exec Database Error: no such table: resources(1)
SELECT fullname FROM resources
Exec Database Error: no such table: properties(1)
SELECT fullname FROM properties
Exec Database Error: no such table: assignment(1)
SELECT job_id, job_pid, job_serverid, job_pgrp, job_password, job_cluster, job_priority, job_globalorder, job_localorder, job_user, job_domain, job_name, job_label, job_reservations, job_groups, job_hosts, job_hostorder, job_cpus, job_restrictions, job_requirements, job_status, job_subjobstatus, job_agendastatus, job_data, job_prototype, job_kind, job_path, job_logpath, job_prototypepath, job_todo, job_lastupdate, job_timesubmit, job_timestart, job_timecomplete, job_flags, job_account, job_env, job_reason, job_timeout, subjob_id, subjob_status, subjob_data, subjob_result, subjob_count, subjob_retry, subjob_seq, server_address, procid, trid, verified, outpos, errpos, orders, timestart, started, missing, sid, jobstats_jobid, jobstats_subid, jobstats_threads, jobstats_start, jobstats_end, jobstats_maxmemory, jobstats_maxswap, jobstats_host FROM assignment
Exec Database Error: no such table: variables(1)
SELECT name, value FROM variables
Exec Database Error: no such table: locks(1)
SELECT fullname FROM locks
Exec Database Error: no such table: assignment(1)
SELECT job_reservations FROM assignment
Exec Database Error: no such table: assignment(1)
SELECT job_id, job_pid, job_serverid, job_pgrp, job_password, job_cluster, job_priority, job_globalorder, job_localorder, job_user, job_domain, job_name, job_label, job_reservations, job_groups, job_hosts, job_hostorder, job_cpus, job_restrictions, job_requirements, job_status, job_subjobstatus, job_agendastatus, job_data, job_prototype, job_kind, job_path, job_logpath, job_prototypepath, job_todo, job_lastupdate, job_timesubmit, job_timestart, job_timecomplete, job_flags, job_account, job_env, job_reason, job_timeout, subjob_id, subjob_status, subjob_data, subjob_result, subjob_count, subjob_retry, subjob_seq, server_address, procid, trid, verified, outpos, errpos, orders, timestart, started, missing, sid, jobstats_jobid, jobstats_subid, jobstats_threads, jobstats_start, jobstats_end, jobstats_maxmemory, jobstats_maxswap, jobstats_host FROM assignment
Exec Database Error: no such table: locks(1)
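Since the same thing has shown up on more than one node, it may help to tally which tables each worker log reports as missing. The following is only a rough sketch: it assumes the log lines look exactly like the excerpt above, and the function name count_missing_tables is made up for illustration (it is not part of any Qube tool).

```python
import re
from collections import Counter

# Matches lines such as "Exec Database Error: no such table: assignment(1)".
# The pattern is an assumption based only on the log excerpt above.
MISSING_TABLE = re.compile(r"Exec Database Error: no such table: (\w+)\(\d+\)")


def count_missing_tables(log_text):
    """Return a Counter mapping table name -> number of 'no such table' errors."""
    counts = Counter()
    for line in log_text.splitlines():
        match = MISSING_TABLE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the worker log on "RB-01".
    sample = "\n".join([
        "Exec Database Error: no such table: assignment(1)",
        "SELECT job_id, subjob_id FROM assignment",
        "Exec Database Error: no such table: variables(1)",
        "SELECT name, value FROM variables",
        "Exec Database Error: no such table: assignment(1)",
        "SELECT job_reservations FROM assignment",
    ])
    for table, hits in count_missing_tables(sample).most_common():
        print(table, hits)
```

Running this over the log from each affected node would show whether the same set of tables is missing everywhere or whether it differs from node to node.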
Seems to me as if the MySQL tables have gone screwy?
My solution was to run the "upgrader_worker -reset" command and that seemed to take care of the problem.
However, this is a bit worrying because it hasn't just happened once but on a few other nodes as well on various occasions.
What could be causing this corruption?
For your information, our test supervisor runs Windows XP Professional 64-bit and our test nodes run Windows 2000 for now.
Thanks!
Nikos