Friday, May 11, 2007

Weblogic Errors & Resolutions

  1. Change Development Mode to Production Mode in WebLogic: Some of you have asked how to change the WebLogic startup mode from development to production, or vice versa. It is very simple. One way is to edit setDomainEnv.cmd (setDomainEnv.sh on Unix), which resides in the $root_domain/bin folder.
    1. Look for the line that sets the PRODUCTION_MODE script variable (set PRODUCTION_MODE=...). Set the value to false to start the server in development mode, or to true to start it in production mode, for example: set PRODUCTION_MODE=false
    2. Save your changes and exit the text editor.
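The same edit can be scripted. The following is only a minimal sketch: set_production_mode is a hypothetical helper name, and it assumes the script contains an assignment line of the form PRODUCTION_MODE="..." (setDomainEnv.sh) or set PRODUCTION_MODE=... (setDomainEnv.cmd). Check your own file before relying on it.

```shell
# Sketch: switch PRODUCTION_MODE in a setDomainEnv script.
# set_production_mode is a hypothetical helper, not a WebLogic tool.
# Assumption: the assignment line starts with PRODUCTION_MODE= (Unix)
# or "set PRODUCTION_MODE=" (Windows).
set_production_mode() {
    spm_file="$1"    # path to setDomainEnv.sh or setDomainEnv.cmd
    spm_value="$2"   # "true" for production, "false" for development

    case "$spm_value" in
        true|false) ;;
        *) echo "value must be true or false" >&2; return 1 ;;
    esac

    # Keep a backup copy before editing.
    cp "$spm_file" "$spm_file.bak" || return 1

    # Rewrite only the assignment line; every other line is left untouched.
    sed -e "s/^\([[:space:]]*PRODUCTION_MODE=\).*/\1\"${spm_value}\"/" \
        -e "s/^\([[:space:]]*set PRODUCTION_MODE=\).*/\1${spm_value}/" \
        "$spm_file.bak" > "$spm_file"
}
```

For example, set_production_mode "$root_domain/bin/setDomainEnv.sh" true switches the domain to production mode and leaves the previous version in setDomainEnv.sh.bak. The server has to be restarted for the change to take effect.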
  2. Authentication Denied - Boot identity not valid: "<11-May-2007 20:00:57 UTC> <Critical> <Security> <BEA-090402> <Authentication denied: Boot identity not valid; The user name and/or password from the boot identity file (boot.properties) is not valid. The boot identity may have been changed since the boot identity file was created. Please edit and update the boot identity file with the proper values of username and password. The first time the updated boot identity file is used to start the server, these new values are encrypted.>" Follow these steps:
    1. Always back up files before removing them: take a copy of boot.properties and of the ManagedServerDir/data/ldap directory first.
    2. Remove the boot.properties file from the managed server directory.
    3. Remove the ManagedServerDir/data/ldap directory.
    4. Now start the server from the command prompt and provide the username/password used to log in to the Admin Console.
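The backup and removal steps can be combined by moving the files aside instead of deleting them. This is a minimal sketch; reset_boot_identity is a hypothetical helper name, and the directory layout (boot.properties directly under the server directory, LDAP data under data/ldap) is taken from the steps above.

```shell
# Sketch: move the boot identity file and the embedded LDAP data aside.
# reset_boot_identity is a hypothetical helper, not a WebLogic tool.
reset_boot_identity() {
    rbi_server_dir="$1"   # the managed server directory (ManagedServerDir)
    rbi_backup_dir="$2"   # directory that receives the backup copies

    mkdir -p "$rbi_backup_dir" || return 1

    # Moving (rather than deleting) keeps a copy that can be restored later.
    if [ -f "$rbi_server_dir/boot.properties" ]; then
        mv "$rbi_server_dir/boot.properties" "$rbi_backup_dir/" || return 1
    fi
    if [ -d "$rbi_server_dir/data/ldap" ]; then
        mv "$rbi_server_dir/data/ldap" "$rbi_backup_dir/ldap" || return 1
    fi
}
```

Stop the server before running it, and use a fresh backup directory each time so an earlier backup is not mixed with the new one.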
  3. Recover SerializedSystemIni.dat When It Gets Corrupted: Follow these steps to recover the SerializedSystemIni.dat file:
    1. Go to DOMAIN_HOME/config
    2. Open the config.xml file and clear the encrypted value in every <credential-encrypted> element, replacing its content with “”. Save the file.
    3. Go to DOMAIN_HOME/servers/AdminServer/security
    4. Remove the boot.properties file if it exists.
    5. Go to DOMAIN_HOME/security
    6. Remove SerializedSystemIni.dat
    7. Go to DOMAIN_HOME/ and rename fileRealm.properties to fileRealm.properties.src, then open fileRealm.properties.src and change all the hashed (encrypted) passwords to clear-text passwords.
      1. For example:  user.system=0xaxyzb45e6f6c4eefdd1f14495ff739b5536904c to user.system=password
      2. Make sure to use the same password that was set for the domain.
      3. Save the file.
    8. Open a terminal and go to DOMAIN_HOME/bin
      1. Execute source ./setDomainEnv.sh
      2. Then, in the same terminal, execute the following command to regenerate the 'SerializedSystemIni.dat' and 'fileRealm.properties' files:
        1. java weblogic.security.acl.internal.FileRealm fileRealm.properties SerializedSystemIni.dat
    9. Now open startWebLogic.sh, located in DOMAIN_HOME/bin, and add -Dweblogic.system.StoreBootIdentity=true to JAVA_OPTIONS. This will create the boot.properties file. Save the script, start WebLogic with startWebLogic.sh or startWebLogic.cmd, and enter the user name and password when asked.
  4. Exception Snippet: <Dec 31, 2013 2:06:30 PM UTC> <Error> <JTA> <BEA-114089> <User [<anonymous>] is not authorized to invoke ackPrepare on a Coordinator.>
    <Dec 31, 2013 2:06:30 PM UTC> <Error> <JTA> <BEA-110495> <User [<anonymous>] is not authorized to invoke AckRollback on a Coordinator.>
    1. The BEA Cause and Action texts are not very useful here. I experimented with 'Cross-domain security' and 'interoperability mode' and was able to reproduce the problem. My conclusion is that these errors only occur when cross-domain security is enabled (with identical credentials on both domains, or when someone tries to set up global trust along with cross-domain security).
  5. Inter-domain transaction: The domains and all participating resources must have unique names. That is, a JDBC data source, a server or a domain cannot have the same name as an object in another domain or as the other domain itself.
  6. Address Already in Use: 
    1. ERR: transport error 202: bind failed: Address already in use
      Starting weblogic with Java version:
      ERROR: transport error 202: bind failed: Address already in use
      ERROR: JDWP Transport dt_socket failed to initialize, TRANSPORT_INIT(510)
      JDWP exit error AGENT_ERROR_TRANSPORT_INIT(197): No transports initialized [../../../src/share/back/debugInit.c:690]
      FATAL ERROR in native method: JDWP No transports initialized, jvmtiError=AGENT_ERROR_TRANSPORT_INIT(197)
      1. Root Cause: This error is caused by a port conflict. If the OS has not yet reclaimed the port that WebLogic Server was using when the server was stopped or killed, and the user issues the start command again, the server tries to bind to a port that another process is still holding, and WebLogic Server fails to start.
      2. Solution: This error mostly appears after the server has been killed forcefully, so before restarting, check the status of every port WebLogic uses (e.g. the HTTP port, ALSB_DEBUG_PORT and DEBUG_PORT) with this command:
        netstat -an | grep <port no> # shows whether this port is still active
        e.g. netstat -an | grep 9001
        If these commands return a result, wait and give the OS some time to reclaim the port. In most scenarios the port is freed after a while and you can restart the servers.
        If the port is not released, some process is still holding it. That process could be the existing WebLogic PID or some daemon; engage your Unix administrator to find that process and kill it.
        You can also bounce the whole server to release the port.
        If the problem occurs for ALSB_DEBUG_PORT or DEBUG_PORT, you can also change the port number, since these ports are internal to WebLogic and are not used for communication with any external system.
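The check-and-wait routine can be scripted as well. This is a rough sketch: port_in_use and wait_for_port_release are hypothetical helper names, and the default of 12 checks, 5 seconds apart, is an arbitrary assumption. The match is a little stricter than a plain grep, because it requires the port number to follow a "." or ":" so that 9001 does not also match 19001.

```shell
# Sketch: detect whether a port still shows up in "netstat -an" output.
# Both helpers are hypothetical names, not part of WebLogic.

# Reads netstat output on stdin; succeeds if any line mentions the port
# as an address suffix (":9001" on Linux, ".9001" on Solaris and others).
# Like the plain grep, it matches the local or the foreign address column.
port_in_use() {
    piu_port="$1"
    grep -qE "[.:]${piu_port}[[:space:]]"
}

# Polls until the port disappears from netstat or the attempts run out.
wait_for_port_release() {
    wfp_port="$1"
    wfp_tries="${2:-12}"   # number of checks (assumed default)
    wfp_delay="${3:-5}"    # seconds between checks (assumed default)

    while [ "$wfp_tries" -gt 0 ]; do
        if ! netstat -an | port_in_use "$wfp_port"; then
            return 0       # port is free, safe to start the server
        fi
        wfp_tries=$((wfp_tries - 1))
        sleep "$wfp_delay"
    done
    return 1               # still held by some process
}
```

For example, wait_for_port_release 9001 && ./startWebLogic.sh starts the server only once port 9001 no longer appears in netstat.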
  7. Exception while starting Weblogic Server: Err: Error occurred during initialization of VM
    Could not reserve enough space for object heap
    1. Root Cause: You would get better help for this on a WebLogic Server or Java forum. The problem is a fairly typical JVM and operating system interop issue. On startup, the JVM tries to reserve a contiguous block of virtual memory for the heap and permgen. If it is unable to do so, it fails to start. On a 32-bit operating system the virtual memory address space is 4GB, but drivers and operating system processes can break up the available space into smaller chunks, so that Java is not able to allocate the contiguous block it requires. Any number of things could have caused this to start happening all of a sudden, anything from a Windows update to installing new hardware or software.
    2. Several Possible Solutions:
      1. Try to reduce the memory allocated to the WLS Java process. Move down in small increments until the server starts, e.g. to 2gb/1gb/512m. Perhaps you can get your app to run with reduced memory.
      2. Switch to a 64-bit operating system and a 64-bit JVM. Note that you don't need more than 4GB of actual RAM to get a benefit from a 64-bit environment in this case. It is the size of the virtual memory address space that counts. You will need to seek advice on a WebLogic Server forum regarding configuring WLS to run with a 64-bit JVM.
      3. Switch to running WLS using JRockit. I do not believe JRockit has the contiguous memory requirement.
      4. Do some low-level debugging to identify which drivers or dlls are loaded where in memory and uninstall offenders or attempt to move them. You can find information on this via Google, but I don't really recommend doing this unless you enjoy low-level debugging and have some experience with it.
      5. If the error occurs because the JVM has less memory allocated than it needs, try increasing the memory in increments instead. If your server has enough memory, one solution is to increase the min and max heap size of the JVM. As a best practice, min and max size should always be equal, so that the JVM reserves that much memory during initialization.
      6. In the setDomainEnv.sh file, find the ‘EXTRA_JAVA_PROPERTIES’ property and add “-Xms3072m -Xmx3072m” or “-Xms3g -Xmx3g”. Alternatively, in the WebLogic Admin Server console, click on the server, go to the “Server Start” tab, and specify the JVM parameter “-Xms3072m -Xmx3072m” or “-Xms3g -Xmx3g”.
        Check the server's .out file to verify that the change has taken effect.
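Before editing the server scripts, it can save time to find out which heap size the JVM is able to reserve at all. The sketch below tries a list of sizes with java -version, which starts and exits the JVM without running anything. first_heap_that_starts and the JAVA_BIN variable are hypothetical names, and the result is only a hint: a server with its full classpath and extra options may still behave differently.

```shell
# Sketch: print the first heap size for which the JVM starts.
# first_heap_that_starts and JAVA_BIN are hypothetical names;
# JAVA_BIN defaults to the "java" found on the PATH.
first_heap_that_starts() {
    for fhs_size in "$@"; do
        # Equal -Xms and -Xmx make the JVM reserve the whole heap at startup,
        # so a failure here reproduces the "could not reserve" error.
        if "${JAVA_BIN:-java}" "-Xms${fhs_size}" "-Xmx${fhs_size}" -version >/dev/null 2>&1; then
            echo "$fhs_size"
            return 0
        fi
    done
    return 1   # none of the sizes worked
}
```

For example, first_heap_that_starts 3g 2g 1g 512m tries the sizes from largest to smallest and prints the first one that works.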
  8. Err: Could not obtain an exclusive lock to the embedded LDAP data files directory
    <31/05/2012 4:32:08 PM EST> <Error> <EmbeddedLDAP> <BEA-171519> <Could not obtain an exclusive lock to the embedded LDAP data files directory: /hta/home/fusion/osb_home/osb_install1/user_projects/domains/vhaosb-dev2/servers/AdminServer/data/ldap/ldapfiles because another WebLogic Server is already using this directory. Ensure that the first WebLogic Server is completely shutdown and restart the server.>
    <31/05/2012 4:32:16 PM EST> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FAILED>
    <31/05/2012 4:32:16 PM EST> <Error> <WebLogicServer> <BEA-000383> <A critical service failed. The server will shut itself down>
    <31/05/2012 4:32:16 PM EST> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FORCE_SHUTTING_DOWN>
    1. Root Cause: Sometimes the .lok file is not removed automatically, even after a proper or forced shutdown. When users then try to restart the server, the file is still present and the server fails to start.
    2. Solution: Navigate to the ldapfiles location under the server, delete the “EmbeddedLDAP.lok” file from there (the location would be e.g. /hta/home/fusion/osb_home/osb_install1/user_projects/domains/vhaosb-dev2/servers/AdminServer/data/ldap/ldapfiles), and restart the WebLogic Server.
  9. Err: The persistent store "_WLS_AdminServer" could not be deployed: <01/06/2012 10:56:47 AM EST> <Error> <Store> <BEA-280061> <The persistent store "_WLS_AdminServer" could not be deployed: weblogic.store.PersistentStoreException: [Store:280105]The persistent file store "_WLS_AdminServer" cannot open file _WLS_ADMINSERVER000000.DAT.
    weblogic.store.PersistentStoreException: [Store:280105]The persistent file store "_WLS_AdminServer" cannot open file _WLS_ADMINSERVER000000.DAT.
                    at weblogic.store.io.file.Heap.open(Heap.java:312)
                    at weblogic.store.io.file.FileStoreIO.open(FileStoreIO.java:104)
                    at weblogic.store.internal.PersistentStoreImpl.recoverStoreConnections(PersistentStoreImpl.java:413)
                    at weblogic.store.internal.PersistentStoreImpl.open(PersistentStoreImpl.java:404)
                    at weblogic.store.admin.AdminHandler.activate(AdminHandler.java:126)
                    Truncated. see log file for complete stacktrace
    <01/06/2012 10:56:47 AM EST> <Critical> <WebLogicServer> <BEA-000362> <Server failed. Reason:
    1. Root Cause: This error mostly occurs when WebLogic is not able to read the file store .DAT file, which WebLogic uses for its internal workings. Seven WebLogic subsystems write information to this file: Diagnostic Service, JMS Messages, JTA Transaction Log (TLOG), Web Services, EJB Timer Services, Store-and-Forward (SAF) Service Agents and Path Service. The file can also become corrupted during a forced shutdown when users run kill -9, or for any number of other reasons.
    2. Solution
      1. cd $DomainHome/servers/AdminServer/data/store
      2. find . -name "*.DAT"
      3. Verify that the file name in the result matches the one in the error message.
      4. Rename this file and move it from this directory to some other directory.
      5. Find "EmbeddedLDAP.lok" and "AdminServer.lok" as well and remove them.
      6. Check the port using netstat -an | grep <Weblogic server port>; there should not be any open connection to this port.
      7. Start your WebLogic server either via the WebLogic script or Node Manager.
        1. Note: The .DAT file is very important and also contains business data in a production system. Please take a backup of this file before doing any operation on it, so that it can be analysed later to complete those transactions.
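Finding the .DAT files and moving them out of the store directory can be done in one go. This is a minimal sketch; set_aside_file_store is a hypothetical helper name, and the backup directory is assumed to live outside the store directory.

```shell
# Sketch: move every .DAT file under a store directory into a backup directory.
# set_aside_file_store is a hypothetical helper, not a WebLogic tool.
set_aside_file_store() {
    safs_store_dir="$1"    # e.g. $DomainHome/servers/AdminServer/data/store
    safs_backup_dir="$2"   # must be outside the store directory

    mkdir -p "$safs_backup_dir" || return 1

    # The quoted pattern stops the shell from expanding *.DAT itself.
    # Files are moved, not deleted, so they can still be analysed later.
    find "$safs_store_dir" -type f -name '*.DAT' -exec mv {} "$safs_backup_dir/" \;
}
```

Run it only while the server is stopped. If two store subdirectories contain a file with the same name, the second move would overwrite the first, so use one backup directory per store in that case.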
  10. ERR: The loading of OPSS java security policy provider failed due to exception: <Jun 5, 2012 1:37:53 PM EST> <Critical> <WebLogicServer> <BEA-000386> <Server subsystem failed. Reason: weblogic.security.SecurityInitializationException: The loading of OPSS java security policy provider failed due to exception, see the exception stack trace or the server log file for root cause. If still see no obvious cause, enable the debug flag -Djava.security.debug=jpspolicy to get more information. Error message: JPS-02592: Failed to push ldap config data to libOvd for service instance "idstore.ldap" in JPS context "default", cause: org.xml.sax.SAXException: Error Parsing at line #1: 1.
    org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; <Line 1, Column 1>: XML-20108: (Fatal Error) Start of root element expected.
    1. Root Cause: I have not yet found the exact root cause of this problem. For now I am only describing the solution that worked for me.
    2. Solution:
      1. Delete the “tmp” folder from $Domain/servers/AdminServer.
      2. Find all the .lok and .DAT files and delete them.
        find <location up to $Domain/servers/AdminServer> -name "*.lok"
        find <location up to $Domain/servers/AdminServer> -name "*.DAT"
        Use the “rm” command to delete the files returned by the find commands above.
      3. Restart the WebLogic server.
  11. ERR: EmbeddedLDAP java.lang.ArrayIndexOutOfBoundsException
    During an Admin Server restart we were getting the error below continuously and the Admin Server was not coming up:
    ####<Nov 5, 2012 4:47:05 AM EST> <Critical> <EmbeddedLDAP> <vans075007> <AdminServer> <VDE Replication Thread> <<anonymous>> <> <> <1352051225634> <BEA-000000> <java.lang.ArrayIndexOutOfBoundsException: 0
    1. Root Cause: This error occurs when the changelog.data and changelog.index files, located in servers/AdminServer/data/ldap/ldapfiles, get corrupted.
      The following sequence shows how these files can get corrupted:
      While the Admin Server was writing an LDAP entry to the changelog, it was interrupted by a forced shutdown, which left the changelog partially updated.
      When the Admin Server rebooted, it attempted to process the changelog (i.e., send the entries to the managed servers), but encountered the partially updated changelog.
      The partial update was an entry that had been assigned a change number but had no data. This happens when the changelog writer is interrupted between the index update and the data update (this update runs in a synchronized method).
    2. Solution: Take a backup of the existing “ldap” folder, delete these two files (changelog.data and changelog.index), and restart the Admin Server. Both files will be created again and the server will start successfully.
      The changelog.data file is used in WebLogic Server (WLS) to store LDAP information regarding users, groups, roles and policies. The embedded LDAP server has an index file and a data file. Each entry in the data file is pointed to by an index file entry, and each index file entry is keyed by an integer that identifies the entry.
      For more details about the issue and solution, please refer to Oracle Note Doc ID 1325978.1. The following is a list of general actions we can perform in a test environment to fix server startup problems:
      1. Try to delete all the .lok files.
      2. Try to delete the .DAT file inside $Domain/servers/AdminServer/step.
      3. Take a backup of the "tmp" folder inside the servers folder and delete the “tmp” folder.
      4. Take a backup of the "data" folder inside the servers folder and delete the “data” folder.
      5. Change the port number if some port is already in use.
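For the changelog fix in this item (back up the ldap folder, then delete changelog.data and changelog.index), a minimal sketch could look like this. reset_ldap_changelog is a hypothetical helper name, and it assumes the two files sit in the ldapfiles subdirectory of the ldap folder, as in the path given in the root cause.

```shell
# Sketch: back up the ldap folder, then delete the two changelog files.
# reset_ldap_changelog is a hypothetical helper, not a WebLogic tool.
reset_ldap_changelog() {
    rlc_ldap_dir="$1"   # e.g. servers/AdminServer/data/ldap
    rlc_backup="$2"     # backup target, must not exist yet

    # Refuse to overwrite or nest into an earlier backup.
    if [ -e "$rlc_backup" ]; then
        echo "backup target already exists: $rlc_backup" >&2
        return 1
    fi

    # Copy the whole folder first; stop if the backup fails.
    cp -R "$rlc_ldap_dir" "$rlc_backup" || return 1

    # The server recreates both files on the next start.
    rm -f "$rlc_ldap_dir/ldapfiles/changelog.data" \
          "$rlc_ldap_dir/ldapfiles/changelog.index"
}
```

Run it with the Admin Server stopped, then start the server again.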
  12. To be continued ...