[2007] Ready for processing - Waiting for process... for days

Robert X

Hello,

A lot of posts in this newsgroup ask "how to kill the queuing process",
which locks the Check In process, which in turn locks the Save process...
My question is: have you noticed a specific set of circumstances that
generates these queue locks, and how can we avoid them?
NB: I really miss the spooler errors of PS 2003: they were so gentle!

Robert
 
Sharry Heberer [MSFT]

Here are some helpful hints and some things you can do to investigate why
the Queue is not processing jobs for a particular project:

1. Is your Queue still running properly? If it still processes jobs other
than the ones for the projects that are "blocked", then there is no need to
kill the Queue service - it is behaving as designed. Yes, the spooler was
nice, but it also allowed projects to get into an inconsistent/corrupted
state. Here we try to avoid that by disallowing processing after a core job
fails (like when a Save fails for some reason). In this way we can help
keep bad data from ending up in the Published DB and the Reporting DB, and
hence on Reports.

2. Use the Manage Queue page to look at correlations (use the CorrelationUID
column for help here) to see why a certain correlation is blocked. If you
cannot see any problems and your queue is still working, then your filters
on the Manage Queue page are probably not right - check them, especially the
History section (the problem may have actually occurred days ago). Using
the "By Project" filter works nicely for looking at the queue job history of
projects. For other correlations, use CorrelationUID.

3. Look for jobs in the Failed and Blocking state - those are the jobs that
are "blocking" others on the same correlation (again, use the
CorrelationUID here to see which jobs are affected). You can either retry
these jobs if the error looks recoverable (such as a lost network or DB
connection), or you can cancel them. Canceling with the default settings
will cancel the entire correlation, so make sure you know what data you
could lose by doing so.

4. Then look to see whether there are jobs stuck in the "Getting Enqueued"
state. If so, reopen WinProj on the machine of the user who submitted the
job to see if WinProj will continue sending the project. If that doesn't
work, then you will need to cancel the jobs in this "Getting Enqueued"
state. Note that this effectively means that the save from WinProj never
happened, and that data will need to be saved again. This is the same thing
that happens when you just blindly kill/restart the queue service. But at
least doing it this way means that you know what is being lost, and which
projects may need special attention later.

5. Look at the error (click the link in the Error column) to get an idea
about why the failure occurred. Sometimes you can correct the problem and
re-save/re-submit your job.

6. Start comparing Event Logs to what you've found on the Manage Queue page.
Look for errors around the same time as failed jobs in the queue.

7. ULS Logs. Same technique as #6 - look for errors around the same time as
failed jobs in the queue.
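
To illustrate the "look for errors around the same time" technique from the last two steps, here is a minimal Python sketch. It assumes you have already copied the failure time of a job from the Manage Queue page and exported log entries as (timestamp, message) pairs. The function name, the five-minute window, and all of the sample data are made up for the example; none of it comes from Project Server itself.

```python
from datetime import datetime, timedelta

def errors_near(failed_at, log_entries, window_minutes=5):
    """Return the log entries logged within window_minutes of failed_at.

    log_entries is a list of (timestamp, message) tuples. This is a
    hypothetical helper, not part of any Project Server tooling.
    """
    window = timedelta(minutes=window_minutes)
    return [entry for entry in log_entries
            if abs(entry[0] - failed_at) <= window]

# Hypothetical sample data: one failed job and four log lines.
failed_at = datetime(2007, 3, 1, 14, 2)
log_entries = [
    (datetime(2007, 3, 1, 9, 0), "Service started"),
    (datetime(2007, 3, 1, 14, 1), "Database connection lost"),
    (datetime(2007, 3, 1, 14, 4), "Save job failed"),
    (datetime(2007, 3, 1, 18, 30), "Backup completed"),
]

nearby = errors_near(failed_at, log_entries)
for timestamp, message in nearby:
    print(timestamp.isoformat(), message)
```

Widening the window is the first thing to try if nothing turns up, since the underlying problem may have been logged well before the job was marked as failed.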

Once you clear the blocking job(s), the queue should immediately resume
processing on that correlation again, and pick up from where it last left
off (except, of course, if the jobs were all canceled in the process of
performing the steps above).
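
The blocking behavior described above can be pictured with a small toy model in Python. This is only a sketch of the idea (a failed job holds back later jobs on the same correlation, while other correlations keep flowing); it is not how the Queue service is actually implemented, and the state names, field names, and job data are invented for the example.

```python
def runnable_jobs(jobs):
    """Return ids of waiting jobs that are not held back by a failed job.

    jobs is a list of dicts in queue order, each with hypothetical keys
    'id', 'correlation' and 'state' ('failed_blocking', 'waiting', 'success').
    """
    blocked = set()   # correlations that contain a failed, blocking job
    ready = []
    for job in jobs:
        corr = job["correlation"]
        if job["state"] == "failed_blocking":
            blocked.add(corr)
        elif job["state"] == "waiting" and corr not in blocked:
            ready.append(job["id"])
    return ready

# Hypothetical queue: job 1 failed, so job 2 (same correlation) is held back.
jobs = [
    {"id": 1, "correlation": "A", "state": "failed_blocking"},
    {"id": 2, "correlation": "A", "state": "waiting"},
    {"id": 3, "correlation": "B", "state": "waiting"},
]
before = runnable_jobs(jobs)

# Simulate a successful retry of the blocking job.
jobs[0]["state"] = "success"
after = runnable_jobs(jobs)

print(before, after)
```

In this model, clearing the blocking job is all it takes for the held-back job on correlation A to become eligible again, which mirrors the "resume where it left off" behavior described above.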

Hope this helps to clear up some misconceptions about the Queue. Please
re-post if you have a specific question - I'd be glad to help.
 
