Why does my location service hang? (osagent on multi-homed host)

Question:

Why does my location service hang? (osagent on multi-homed host)

Answer:

Symptom: The osfind command and/or the location service hangs when an osagent is running on a multi-homed host. After several minutes an ObjLocation::Fail exception is thrown. In some cases the hang has appeared to be indefinite.

This note explains one possible cause of these symptoms. It affects only applications that use TCP to communicate with the osagent.

The osagent has two types of clients: those that communicate with it over TCP and those that communicate with it over UDP. Normal VisiBroker applications use UDP. The only clients that use TCP are the location service and the osfind tool, which is built on the location service. When an application uses the location service API, it is in many cases using TCP to communicate with the osagent for those specific calls.

When an application establishes a dialog with the osagent, it does so in a two-step process. It first sends a request for a dedicated handler to the osagent via the well-known osagent port (default 14000). The osagent replies with the address of a dedicated handler for all subsequent communications.

When the osagent is running on a multi-homed host, it tries to provide a handler address that is most compatible with the incoming client request.

The default behavior of the osagent is to set up the TCP and UDP client handlers on all addresses. Each client logs into the osagent and requests the address and port of the handler for subsequent communications. By default, the osagent provides the address of the default host in its reply to the client. If the client cannot reach that address, it will eventually fail to contact that osagent. That is the cause of the symptom described in this note.

For TCP-based clients it is possible to configure the behavior of the osagent using the localaddr file, assuming that there is at least one IP address on the osagent host that can be reached by all possible clients.

For TCP-based clients, if a localaddr file is provided to configure the osagent, then the osagent will attempt to match the client subnet and reply with a compatible address if there is one. Otherwise it will reply with the _first_ address found in localaddr.
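
The selection rule just described can be modeled in a few lines. The following Python sketch is only an illustration of that rule, not osagent code: the function name, the (ip, netmask) tuple representation of localaddr entries, and the sample addresses are all assumptions made for the example.

```python
import ipaddress

def pick_tcp_handler_address(client_ip, localaddr_entries):
    """Model the address returned to a TCP client (location service, osfind).

    localaddr_entries is a hypothetical in-memory form of the localaddr
    file: a list of (ip, netmask) tuples in the order they appear in the file.
    """
    if not localaddr_entries:
        raise ValueError("localaddr must list at least one interface")
    client = ipaddress.ip_address(client_ip)
    for ip, netmask in localaddr_entries:
        # strict=False lets us pass a host address together with its mask.
        network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
        if client in network:
            return ip  # an interface on the client's own subnet
    # No subnet matched: TCP clients get the FIRST localaddr entry.
    return localaddr_entries[0][0]

# Illustrative addresses only.
entries = [("192.168.10.5", "255.255.255.0"), ("10.1.0.5", "255.255.0.0")]
print(pick_tcp_handler_address("10.1.7.20", entries))   # subnet match
print(pick_tcp_handler_address("172.16.0.9", entries))  # falls back to first entry
```

Because the fallback is always the first entry, the order of lines in localaddr matters for any client that is not on one of the listed subnets.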

If clients are on the same subnet as the osagent, but not on the subnet associated with the default host, then providing a complete localaddr file should be sufficient to solve the problem.

If clients are coming from other subnets, the localaddr file must be carefully configured, based on the application requirements, so that it lists only addresses that are reachable by all potential TCP clients (osfind and the location service). Such configuration requires knowledge of the specific IP address, network mask, and broadcast address for each client machine and for the host where the osagent is running. For example, if the host has one non-default interface that is accessible to all potential clients, then configure localaddr with that one specific address. This ensures that even when the osagent cannot match the client network, it will return the first (and only) address from localaddr, which the client should be able to contact given the assumption made.
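
When preparing a single-entry localaddr file like the one in the example above, the broadcast value can be derived from the IP address and network mask rather than worked out by hand. The Python sketch below does that. It assumes a layout of one line per interface with space-separated IP address, network mask, and broadcast address; check that layout against the product manuals for your version before relying on it. The function name and the address used are hypothetical.

```python
import ipaddress

def localaddr_line(ip, netmask):
    """Build one candidate localaddr line for an interface.

    Assumed layout (verify against the product manuals):
    <ip address> <network mask> <broadcast address>
    """
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return f"{iface.ip} {iface.netmask} {iface.network.broadcast_address}"

# Hypothetical non-default interface that every TCP client can reach.
print(localaddr_line("10.1.0.5", "255.255.0.0"))
```

A file containing only this one line matches the single-address configuration described above: every TCP client that does not match the subnet is still given this address.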

Note that the recommendation is very different for UDP clients (normal VisiBroker applications). If a localaddr file is provided to configure the osagent, then the osagent will attempt to match the client subnet and reply with a compatible address if there is one. Otherwise it will reply with the _default_ address of the machine. This behavior differs from that described above for TCP-based clients.
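
The difference between the two fallbacks is easiest to see side by side. The following Python sketch models the behavior described in this note for both transports. It is an illustration only: the function name, the transport strings, the (ip, netmask) tuples, and every address shown are assumptions, not part of the product.

```python
import ipaddress

def pick_handler_address(client_ip, transport, localaddr_entries, default_ip):
    """Model the osagent reply for a client, per the rules in this note.

    transport is "tcp" (location service, osfind) or "udp" (normal
    VisiBroker applications). localaddr_entries is a hypothetical list of
    (ip, netmask) tuples in localaddr file order.
    """
    if transport not in ("tcp", "udp"):
        raise ValueError("transport must be 'tcp' or 'udp'")
    if not localaddr_entries:
        raise ValueError("localaddr must list at least one interface")
    client = ipaddress.ip_address(client_ip)
    for ip, netmask in localaddr_entries:
        if client in ipaddress.ip_network(f"{ip}/{netmask}", strict=False):
            return ip  # both transports: prefer an address on the client's subnet
    if transport == "tcp":
        return localaddr_entries[0][0]  # TCP fallback: first localaddr entry
    return default_ip                   # UDP fallback: default address of the host

entries = [("192.168.10.5", "255.255.255.0"), ("10.1.0.5", "255.255.0.0")]
default_ip = "172.20.0.1"  # hypothetical default interface of the host
print(pick_handler_address("172.16.0.9", "tcp", entries, default_ip))
print(pick_handler_address("172.16.0.9", "udp", entries, default_ip))
```

For the same off-subnet client, the TCP call returns the first localaddr entry while the UDP call returns the default address, which is why restricting localaddr helps TCP clients but does not help UDP clients.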

Thus there is an inherent limitation that applies to UDP clients running on separate subnets. These clients must be able to send UDP messages to the default interface of the host that is running the osagent in order to communicate with that osagent. An alternative configuration that works around this limitation is to run an osagent on each subnet and bridge the subnets using the agentaddr file for each osagent, as described in the product manuals.
 

Platform/version differences:

For VBJ 3.x, VBJ 4.x, VBC++ 3.x, and VBC++ 4.x, the osagent is a platform native executable.

For VBJ 3.x, osfind and the separate locserv application are platform native executables built on the VBC++ ORB.  The locserv application provides the location service API.  The notes above apply to locserv and osfind, not the VBJ application that is accessing the location service.

For VBC++ 3.x, osfind is a platform native executable. The location service API is built into the ORB library runtime, so there is no separate locserv executable involved. The notes above apply to the built-in location service API and thus directly affect the application making use of the location service.

For VBJ 4.x and VBC++ 4.x, osfind is written in Java and uses the location service API built into the Java ORB. The location service is built into the ORB runtime (either Java or C++, depending on which ORB is used). The notes above apply to the built-in location service API and thus directly affect the application making use of the location service.
 



Article originally contributed by Borland Developer Support