Hello, we are having an issue with one of our job servers. Whenever I try to run a publish, it sits there and does nothing, and after 30 minutes it gets canceled. The job goes from Job begin to Dead_Lock, then sits there until it times out. We do have another job server that is working correctly, and we have all the same settings on the two. I have disabled the working one to focus on the non-working one for the moment.
Also, I have deleted the cluster and re-added the job servers, as suggested on the IBM website.
The log files tell me to make the timeout longer, but that didn't help; the job just sat there longer.
Please help with some suggestions on what I should look for and any tests I can perform to narrow down the issue.
thank you
Hey,
A few things:
1. What version of Planning are you using?
2. What services do you have enabled in Cognos Configuration for the 2 job servers?
3. Do reconciles, admin links, etc run fine on the problematic job server? Is it only publishes that cause problems?
4. Has this machine ever worked correctly? If so, are you aware of any changes?
hi,
Do both job servers have the same CPU and type (physical / VM) ?
Thanks for the quick replies:
1. What version of Planning are you using? -- 10.1.1
2. What services do you have enabled in Cognos Configuration for the 2 job servers? -- I can't get to the config application on the server since I don't have remote access. Things are locked down pretty tightly here. But from what I can see in the application, the services are:
AgentService
AnnotationService
BatchReportService
ContentManagerCacheService
DataMovementService
DeliveryService
EventManagementService
GraphicsService
HumanTaskService
JobService
LogService
MetadataService
MonitorService
planningAdministrationConsoleService
planningDataService
planningRuntimeService
planningTaskService
QueryService
ReportDataService
SystemService
3. Do reconciles, admin links, etc run fine on the problematic job server? Is it only publishes that cause problems? -- Nothing runs on this server; everything fails to start.
4. Has this machine ever worked correctly? If so, are you aware of any changes? -- Nothing has ever worked on this server.
Topedgemonk: I'm guessing that they don't have the same CPU, since the working one is a VM and the non-working one is a physical server.
It's the same setup we have in Dev, where both job servers work; it only fails in Test.
This is a shot in the dark, but I think you might be running into a known problem. This may not be the case in your particular instance, but we can start with the easiest and work our way down.
You'll need to confirm whether the planningAdministrationConsoleService (Contributor Administration Console Service in Cognos Configuration) is set to true or false on this job server. Whatever the value is, compare it against your working environment.
The defect is that if the CAC service is present on a job server, but the service is set to false, no jobs will process on this machine. So to correct it, you would set the service to true, restart services, and test.
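If it helps with the comparison, here is a rough Python sketch of that check. It does not read anything from Cognos itself: the true/false values below are made up for illustration and would have to be copied by hand from Cognos Configuration on each machine. The service names come from the list posted earlier in this thread, and the function names are my own.

```python
# Hypothetical sketch: the True/False values are illustrative only and must be
# read off Cognos Configuration on each job server by hand.

CAC_SERVICE = "planningAdministrationConsoleService"


def diff_services(working, failing):
    """Return {service: (working_value, failing_value)} for every mismatch.

    A service missing from one side is reported with None for that side.
    """
    names = sorted(set(working) | set(failing))
    return {
        name: (working.get(name), failing.get(name))
        for name in names
        if working.get(name) != failing.get(name)
    }


def cac_disabled(services):
    """True when the CAC service is present on the machine but set to False."""
    return services.get(CAC_SERVICE) is False


# Example values (assumed, not taken from the poster's environment).
working = {
    "planningAdministrationConsoleService": True,
    "planningDataService": True,
    "planningTaskService": True,
}
failing = {
    "planningAdministrationConsoleService": False,
    "planningDataService": True,
    "planningTaskService": True,
}

for name, (good, bad) in diff_services(working, failing).items():
    print(f"{name}: working={good} failing={bad}")

if cac_disabled(failing):
    print("CAC service is present but disabled on the failing job server")
```

Any service that shows up in the diff is worth a second look, but the case described above is specifically the CAC service being present and set to false.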
Hopefully this is the issue. If not, please post the planningerrorlog and we can take a look.