-
Notifications
You must be signed in to change notification settings - Fork 2.6k
rework jobs #403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rework jobs #403
Conversation
Can you please rebase against current |
needed to comment out async methods and associated tests.
qiskit/backends/basejob.py
Outdated
def __init__(self): | ||
"""Initializes and initates the asynchronous job""" | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also have methods to get back job_id
and backend
that runs the job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case we really need to expose the ID, maybe a better place could be our own implementation (IBMQJob) and not pollute the BaseJob class. If we find out that third-party implementers also need to expose the ID, then yeah, let's move it to the BaseJob class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think it's important that we try to keep the BaseJob
as generic as possible - during our meeting, my mention of including id
as a reachable property was aimed at the IBMQ*
level, and provided that we had a clear current use case were users needed access to it (which I thought we did - but actually, it would be great to have a solid example of it).
Can we keep BaseJob
id-agnostic as possible, following the idea that it should be as less restrictive and away from implementation details as possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this sounds fine to me. Currently the spec says the ID is reported in the status
dictionary. I suppose IBMQJob could add an additional method to have id
or get_id
but am somewhat concerned about adding methods which because of their common usage, are assumed to exist for all backends.
qiskit/backends/ibmq/ibmqbackend.py
Outdated
IBMQJob (BaseJob) | ||
""" | ||
return IBMQJob(self.run_job, q_job, self._api) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also add a method get_jobs()
to return all jobs for this backend (for the authenticated user)?
qiskit/_quantumprogram.py
Outdated
# self._run_internal(qobj_list, | ||
# wait=wait, | ||
# timeout=timeout, | ||
# callback=callback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just remove the code, thanks to git history there's no need to comment code anymore :)
qiskit/backends/basejob.py
Outdated
def __init__(self): | ||
"""Initializes and initates the asynchronous job""" | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case we really need to expose the ID, maybe a better place could be our own implementation (IBMQJob) and not pollute the BaseJob class. If we find out that third-party implementers also need to expose the ID, then yeah, let's move it to the BaseJob class.
qiskit/backends/basejob.py
Outdated
pass | ||
|
||
@abstractmethod | ||
def running(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the status()
method enough to know whether the job is running or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Futures implements running
so it is simple to propagate. Currently I populate status with that method.
qiskit/backends/ibmq/ibmqjob.py
Outdated
def running(self): | ||
return self._future.running() | ||
|
||
def add_done_callback(self, fn): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm that might be useful for local jobs... but moving to a future
based async model may render this method useless. Do we have a use-case for this method?
qiskit/backends/local/localjob.py
Outdated
def running(self): | ||
return self._future.running() | ||
|
||
def add_done_callback(self, fn): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above: Do we really need this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it's best to keep it simple for now and not use it. There was once a case where I wanted to time individual jobs on a simulator but there are other ways.
This is great! I love moving from callback based async approach to a future based async approach!! |
If we have a |
I think instance, since these are qiskit components talking to each other. |
@atilag The use case for a job id is if the user wants to retrieve a job in the future and this would be a backend unique identifier for that job. Also this will go into the results data structure and so be a way for users to link results back to a specific execution. |
@atilag The use case for a job id is if the user wants to retrieve a job in the future and this would be a backend unique identifier for that job. Also this will go into the results data structure and so be a way for users to link results back to a specific execution. (reply to this one) Is is too much of a constraint from a general backend to provide a unique id? It doesn't seem like it would be. |
@ajavadia If we have a |
@ewinston i think so. this is similar to *(btw i think that method should be |
@ajavadia @ewinston If I think there should be some string type id to uniquely label all backends and jobs separate from the apparatus of objects (?), but in particular for jobs. |
@dcmckayibm I think returning objects avoids the overhead of translating between string names and instances by the user. Objects are easier to work with. You get it and call 1- For backends, it is safe to assume there aren't hundreds of them. So if you look at the current implementation, the list of backend instances is created at the time of provider authentication, and maintained thereafter (the API call to 2- You might be right that for jobs, we may not want to carry this list of job instances through. In that case: |
@ajavadia I guess I'm more thinking from a data structures perspective. Consider I have a Maybe the thing I'm advocating isn't' necessarily that |
Sure, but for me the
So let's say that the job was not ready and we have serialized it to disk, we can load it from disk again and rebuild an instance of our Job instance, by deserializing:
The benefits of this approach are:
I'd say that you are right, and sounds legit to me, but I wouldn't like to make assumptions without having any other experience on backend implementation aside from our own backends. As time is a constraint, and we don't have a serializer/deserializer in place, my suggestion would be to move the |
qiskit/backends/basejob.py
Outdated
def status(self): | ||
""" | ||
Returns: | ||
str: 'ERROR', 'QUEUED', 'RUNNING', 'CANCELLED', or 'DONE' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are at it, can we introduce something a bit more robust than plain strings, such as Enum
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the spec describes status as a dictionary which I'm changing this to. The status dictionary looks like,
@property
@abstractmethod
def status(self):
"""
Returns:
dict: {'job_id': <job_id>,
'status': 'ERROR'|'QUEUED'|'RUNNING'|'CANCELLED'|'DONE',
'status_msg': <str>
"""
pass
I suppose that status
key could be Enum
however that could be a hassle if we need to serialize it at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enums
have some advantages over strings in this kind of contexts:
- It's safer and less prone to user typing errors.
- IDEs usually have a proper support.
- If there's a need to change the value of the
Enum
, you just need to change it in one place. - Less memory usage (not a big deal for this class, though)
I wouldn't worry to much about the serialization, depending on the type of serialization, the Enum
will turn into a string or a binary value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
status
is an enum
in the schema (as of now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@atilag the thing about the job_id
is that the physical thing that runs and the result it produces are not necessarily a part of qiskit
. Technically some one could use a different program to run a job and how would we be able to grab the results (or vice-versa)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The majority of users will never leave the qiskit ecosystem so they should be abstracted away from job_id's as much as possible, but I just want it exposed for these edge cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dcmckayibm Doesn't exposing it in just ibmq jobs satisfy the edge case but also not force people who don't need it from being forced to implement it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any remote job will need a job_id, maybe not local simulators (but I could still think of cases, e.g., local simulators with threading that could make use of a job_id)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to leave it out of the qiskit architecture that could be ok, but it will need to be in the result data structure that is returned by the job
@ewinston Thanks Erick, I think this is very close to being done. I only have a couple more comments:
|
- added async test - added backend_name method to IBMQJob and LocalJob - cleaned unused timeouts for QuantumJob - added queued method to IBMQJob and LocalJob
Awesome. It looks ready to merge to me! |
@ewinston: just a minor thing: when i submit a job to ibmqx4, and monitor Also my queue position seems to be constant. I don't know if you see a similar thing. |
@ajavadia The temporary status of When I monitored the queue position of a job it seemed to be not changing but was actually changing really slowly. Also, I'm working on another branch of this one which allows submitting ibmq jobs faster. It's on my fork on "ibmq_future" branch. What caveat of it is that an ibmq job can briefly not have a job id. |
@ewinston I think maybe None is supposed to be RUNNING actually? Because I just see QUEUED, None and COMPLETED. Here is one output I just got... I query the status every 3 seconds:
|
qiskit/backends/ibmq/ibmqjob.py
Outdated
(pprint.pformat(job_result))) | ||
elif job_result['status'] == 'RUNNING': | ||
# we may have some other information here | ||
if 'infoQueue' in job_result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fix to my comment above is just to add
_status = JobStatus.RUNNING
in the if
here too. Right now it is only in the else
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - that seems to be a needed fix! 👍
test/python/test_ibmqjob.py
Outdated
cls._qc = qc | ||
cls._provider = IBMQProvider(QE_TOKEN, QE_URL) | ||
|
||
def test_run(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add another test_run_device
(or similar name), with @slow_test
, that runs this same test against a real device? It seems to be failing (tested with ibmqx4
):
self = <test.python.test_ibmqjob.TestIBMQJob testMethod=test_run>
def test_run(self):
backend = self._provider.get_backend('ibmqx4')
qobj = qiskit._compiler.compile(self._qc, backend)
quantum_job = QuantumJob(qobj, backend, shots=1024, preformatted=True)
job = backend.run(quantum_job)
> result = job.result()
test/python/test_ibmqjob.py:59:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
qiskit/backends/ibmq/ibmqjob.py:63: in result
this_result = self._wait_for_job(timeout=timeout, wait=wait)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <qiskit.backends.ibmq.ibmqjob.IBMQJob object at 0x7f4661bc45f8>, timeout = None, wait = 5
def _wait_for_job(self, timeout=60, wait=5):
....
# Get the results
api_result = self._api.get_job(job_id)
job_result_return = []
for index in range(len(api_result['qasms'])):
> job_result_return.append({'data': api_result['qasms'][index]['data'],
'status': api_result['qasms'][index]['status']})
E KeyError: 'data'
qiskit/backends/ibmq/ibmqjob.py:245: KeyError
test/python/test_ibmqjob.py
Outdated
num_queued, num_jobs) | ||
self.log.info('number of currently running jobs: %d/%d', | ||
num_running, num_jobs) | ||
self.assertTrue(all([(job.running or job.queued) for job in job_array])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you replace this line with an slightly expanded version? Something along the lines of:
self.assertTrue(all([(job.running or job.queued or job.done) for job in job_array]))
# Ensure job ids are unique.
job_ids = [job.job_id for job in job_array]
self.assertEqual(sorted(job_ids), sorted(list(set(job_ids))))
# Wait for all the results.
result_array = [job.result() for job in job_array]
# Ensure all jobs have finished.
self.assertTrue(all([job.done for job in job_array]))
self.assertTrue(all([result.get_status() == 'COMPLETED' for result in result_array]))
The or job.done
, along with the fix mentioned by @ajavadia at https://github.com/QISKit/qiskit-sdk-py/pull/403/files#r186906393, should be enough for passing the test in its current state - there is the minor issue that the line would make three queries to the API (one for each job.something
) and increase the running time, but seems we can afford that for the moment.
The rest of the lines are for making the tests more complete: not only launching the jobs, but ensuring that they get completed and that they handle different job_ids. For the simulator, it does seem indeed that all the jobs will be in job.done
state by the time they reach line 91 (but it seems fine too).
test/python/test_ibmqjob.py
Outdated
num_queued, num_jobs) | ||
self.log.info('number of currently running jobs: %d/%d', | ||
num_running, num_jobs) | ||
self.assertTrue(all([(job.running or job.queued) for job in job_array])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also expand this line in the same way as the note in test_run_async_simulator()
?
Thanks for the much needed tests, @ewinston ! For the rest of the team: please note that the tests are not run on travis and they need to be launched manually - and are currently failing, unfortunately. The good news is that at least one of the remaining issues that is causing seems to be easy to fix with the proposed fix by @ajavadia - and the other failing test seems also tiny. |
test/python/test_ibmqjob.py
Outdated
quantum_job = QuantumJob(qobj, backend, preformatted=True) | ||
job = backend.run(quantum_job) | ||
while not job.done: | ||
print(job.status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be self.log.info
instead of print
. It prints out a lot of extra lines during testing.
When submitting to a backend with a disabled queue the job would wait in the INITIALIZING state indefinitely
Thanks @ewinston , and basically all team that was involved in this PR - merging! 🎉 |
* Created Job class for backend rework. * updated dummybackend to use BaseJob * test_quantumprogram passing needed to comment out async methods and associated tests. * passing all tests * added simple job run tests for local and IBM Q * hide _run_job method for backends * passing tests and linter * qasm cpp simulator passing tests. * linting * remove enum.auto for python 3.5 * more linting * resolve brokenprocesspool on mac * remove uneeded tests * bug fixes from merge conflict * race condition in job_cancel test * fix inadvertent change to a Makefile * minor commit * pre- moving ibmbackend job methods to ibmqjob. * remove timeout from QuantumJob - also move some job related methods from ibmqbackend to ibmqjob - ibmqjob implements concurrent.futures-like API without using it internally. * merge reorder_bits with async jobs * minor bug fixes * remove some uneeded timeouts. fix creation of job result struct. * add async tests. make ibmq status atomic. * added async tests and minor refinements - added async test - added backend_name method to IBMQJob and LocalJob - cleaned unused timeouts for QuantumJob - added queued method to IBMQJob and LocalJob * correct KeyError in _wait_for_job * make ibmqjob submission async * update ibmq job device test * change wait_for_results logic. add ibmjob device test. * fix register size exception test * minor linting issue * extended ibmq job tests as recommended by @diego-plan9 * adjust ibmqx4 device test for noise * change print to log.info in test_ibqjob * fix minor bug * fix infinite job initializing When submitting to a backend with a disabled queue the job would wait in the INITIALIZING state indefinitely
This PR is to rework the submission and management of jobs.
Description
With PR#376, backend rework, we would like a job instance to be returned in a call to the backend's
run
method;job = backend.run(qobj)
This call is meant to be non-blocking. The job object can be queried for status and results;
Since the implementation of some of these methods may be backend dependent this PR requires the backend to define it's job class which derives from
BaseJob
. The reference implementations for local and ibmq backends are simple and currently usefutures
.Motivation and Context
How Has This Been Tested?
Screenshots (if appropriate):
Types of changes
Checklist: