I have run into this issue many times, so I am sharing the resolution here in the hope that it saves you some time.
Here is the scenario:
You have a SQL Server instance on a two-node cluster. The SQL Agent resource runs fine on one node, but after a failover it fails to come online on the other node.
Below are the troubleshooting steps we followed.
First we checked the event log, which showed the following error messages.
[sqagtres] OnlineThread: ResUtilsStartResourceService failed (status 435)
[sqagtres] StartResourceService: Failed to start SQLSERVERAGENT service. CurrentState: 1
[sqagtres] OnlineThread: Error 435 bringing resource online.
These messages gave no real clue, so we looked for the Agent error log (SQLAGENT.OUT) to get more detail about the service, but there was no SQLAGENT.OUT file in the Log folder.
Since the file was not being generated at all, we started the service from a command prompt to get more information and to find out whether a permission issue was involved.
Below is the output when the service was started from the command prompt.
Microsoft (R) SQLServerAgent 10.50.4000.0 Copyright (C) Microsoft Corporation. StartServiceCtrlDispatcher failed (error 6)
Error 6 means “The handle is invalid”.
The command prompt was launched from a local administrator account, so the error did not look like a permission issue. Because SQLAGENT.OUT was still not being generated, we ran Process Monitor (Procmon) and looked closely at the following entry.
"1:01:48.8321460 PM","SQLAGENT.EXE","12245","RegQueryValue","HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\
MSSQL10.50.MSSQLSERVER\SQLServerAgent\ErrorLogFile","SUCCESS","Type: REG_SZ, Length: 84, Data: F:\Microsoft SQL Server\MSSQL\Log\"
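If the Procmon trace is exported to CSV, the value that SQLAGENT.EXE read can be pulled out of the row with a few lines of Python. This is only a minimal sketch and was not part of the original troubleshooting. The function name error_log_value_from_procmon is hypothetical, and the sketch assumes the row sits on a single line with the detail column as its last field, as in the entry above.

```python
import csv
import io

# The Procmon entry from the trace, joined onto one line (raw string keeps the backslashes).
ROW = r'"1:01:48.8321460 PM","SQLAGENT.EXE","12245","RegQueryValue","HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.50.MSSQLSERVER\SQLServerAgent\ErrorLogFile","SUCCESS","Type: REG_SZ, Length: 84, Data: F:\Microsoft SQL Server\MSSQL\Log\"'


def error_log_value_from_procmon(row_text):
    """Return the text after 'Data: ' in the last column of a Procmon CSV row.

    Hypothetical helper: returns None when the row has no 'Data: ' part.
    """
    fields = next(csv.reader(io.StringIO(row_text)))
    detail = fields[-1]  # e.g. "Type: REG_SZ, Length: 84, Data: F:\..."
    marker = "Data: "
    position = detail.find(marker)
    if position == -1:
        return None
    return detail[position + len(marker):]


if __name__ == "__main__":
    print(error_log_value_from_procmon(ROW))
```

A value that ends in a backslash, as it does here, is a folder path rather than a log file path.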
We noticed that the error log file name, SQLAGENT.OUT, is missing from the end of the value. To be 100% sure, we compared the value in the trace above with the value on the active node, and the working node had the full path.
Value in working node: F:\Microsoft SQL Server\MSSQL\Log\SQLAGENT.OUT
Value in problem node: F:\Microsoft SQL Server\MSSQL\Log\
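The comparison between the two nodes can also be expressed as a small check. The Python sketch below is illustrative only: names_a_file and nodes_missing_file_name are hypothetical helper names, and the values are assumed to have been collected by hand from each node. It uses ntpath so that Windows paths are parsed the same way on any operating system.

```python
import ntpath


def names_a_file(value):
    """Return True if a Windows path ends in a file name, not a path separator."""
    return ntpath.basename(value.strip()) != ""


def nodes_missing_file_name(values_by_node):
    """Return the nodes whose ErrorLogFile value does not end in a file name."""
    return [node for node, value in values_by_node.items()
            if not names_a_file(value)]


if __name__ == "__main__":
    # Values taken from the two nodes in this post.
    values = {
        "working node": r"F:\Microsoft SQL Server\MSSQL\Log\SQLAGENT.OUT",
        "problem node": "F:\\Microsoft SQL Server\\MSSQL\\Log\\",
    }
    print(nodes_missing_file_name(values))
```

The check only catches values that end in a backslash. A folder path written without the trailing backslash would look like a file name to it, so it is a quick sanity test rather than a full validation.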
We appended the file name (SQLAGENT.OUT) to the ErrorLogFile value in the registry on the problem node.
HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.50.MSSQLSERVER\SQLServerAgent\ErrorLogFile
“F:\Microsoft SQL Server\MSSQL\Log\SQLAGENT.OUT”
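We made this change by hand in the registry editor. For anyone who prefers to script it, the following Python sketch shows one possible approach using the standard winreg module, which is available on Windows only. The function names with_log_file_name and repair_error_log_file are hypothetical. The key path is passed in by the caller (relative to HKLM) rather than hard-coded, because it depends on the instance. The script would need to run with administrative rights on the problem node.

```python
import ntpath


def with_log_file_name(value, file_name="SQLAGENT.OUT"):
    """Append file_name when the path ends in a separator; otherwise return it unchanged."""
    if ntpath.basename(value) == "":
        return ntpath.join(value, file_name)
    return value


def repair_error_log_file(key_path, value_name="ErrorLogFile"):
    """Read the value under HKLM\\key_path, fix it if needed, and return the result.

    Hypothetical helper; Windows only, so winreg is imported inside the function.
    """
    import winreg

    access = winreg.KEY_READ | winreg.KEY_SET_VALUE
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path, 0, access) as key:
        current, value_type = winreg.QueryValueEx(key, value_name)
        fixed = with_log_file_name(current)
        if fixed != current:
            # Write back with the same registry type that was read (REG_SZ here).
            winreg.SetValueEx(key, value_name, 0, value_type, fixed)
        return fixed


if __name__ == "__main__":
    # Dry run on the value from the problem node; no registry access happens here.
    print(with_log_file_name("F:\\Microsoft SQL Server\\MSSQL\\Log\\"))
```

It is worth exporting the key before changing it, so that the original value can be restored if needed.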
After that we started the service. It started without errors, and the SQL Agent resource also came online.
Hope this helps you in the future!