D²OL: Developer update

TAandy

Diamond Member
Oct 24, 2002
Dear Community Members!

I am pleased to inform you that we have hired three software developers and engineers to push the development of D2OL version 2.0. The next public release will be Version 2.0, with one intermediate version, 1.5, released solely to the beta tester community. I would like to invite more people to join this community to enhance our testing capability.

There are several fundamental changes to the system that are aimed at improving the stability and scalability of the project. To better explain the changes and how they will affect the users, I would like to briefly outline the architecture of D2OL Version 1.0.

The original system was designed to accommodate a few thousand users. The most important parts of the server-end of this system are:

1. The Scheduler - This system hands out tasks to nodes.
2. The Results Handler - This system receives the results, parses them, and passes information to the Statistics Server.
3. The Statistics Server - This system keeps track of the results that were received from the nodes and updates the statistics pages.

In version 1.0 all three of these were located on the same computer and could not be separated onto physically different machines. This meant that increased user numbers could not be accommodated by simply adding more hardware. We would have had to run several different instances of D2OL to accommodate high user numbers, which is not practical, since load-balancing a system like that would be impossible without major code changes. This arrangement also made debugging the system very difficult. Over the past year several improvements were made, but we came to the realization that we could not identify the source of two major problems that kept bringing the system down. This, in combination with several user requests for added functionality and increased queue sizes, prompted us to completely redesign the system. Version 2.0 will now have the following components:

1. REGISTRATION SERVER - This system handles node registration and functions independently of all other systems. Users have often reported problems with node registration. We hope that those problems will all be solved (unless it is a firewall configuration issue) once this system is in place. As with all the components, this system is scalable by simply adding more servers.
2. CONFIGURATION SERVER - This system is a central information system which communicates with the nodes and tells them what servers are available and, if a given server is not responding, what alternative servers to use. This system was put into place to manage all the servers that make up D2OL and to improve scalability. The servers are no longer required to be in the same physical location, since the configuration server will simply direct the node to an appropriate IP.
3. SCHEDULER - This system is still responsible for handing out tasks. It was redesigned to handle much bigger WU queues and is independent of all other systems. Should the result handler go down, it will keep on handing out tasks to nodes requesting work. Although we anticipate being able to hand out very large numbers of tasks, we will initially set a limit of 1,000. If the system shows that it can handle it, we will increase this number until an optimum value has been reached. This system can easily be duplicated, with the configuration server keeping track of which schedulers are available.
4. RESULTS HANDLER - This system receives the results from the client. In version 1.0, the result handler was responsible for parsing the incoming information and sending a small file to the statistics server. Parsing is a trivial task, but it required the result handler to be in constant communication with the statistics server, and a breakdown in communication between these two systems in version 1.0 has been identified as a possible source of some of the problems we have experienced. The parsing is now being delegated to the node. Again, this system is independent and can function even if the scheduler is down.
5. STATISTICS SERVER - This system receives files from the nodes (via the results handler) which contain statistics-relevant information. The server processes these files and updates the statistics information at regular intervals.
6. STRUCTURE SERVER - This system is responsible for uploading tasks to the client. This system will be completely independent from the website, which includes the forums. This ensures that the website and forums will remain active even if the structure server should be down. The system is completely independent from all other servers and can be duplicated to improve scalability.
7. STATISTICS VIEWER - This system serves the JSP pages which display the statistics. As part of this update we will make an XML API available for users who would like to obtain the statistics to display on their own web-sites. Bots and 3rd party websites should use the new API rather than screen-scraping the HTML. We welcome any input on the format of the API.
8. MEMBER SERVICES - This system handles all User and Team related information. We have received several requests for users to be able to better manage their accounts. Most if not all of these will be addressed in Version 2.0.
9. WEBSERVER - This system will serve our static web pages and the forums.
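
To make the role of the configuration server more concrete, here is a minimal sketch of the failover idea described in item 2: the node asks for a server of a given type and is pointed to the first one that is not known to be down. This is only an illustration, not the actual D2OL code. The class name, the method names, the role labels and the IP addresses (taken from the reserved documentation range) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationServer:
    """Hypothetical in-memory directory of servers, keyed by role."""

    servers: dict = field(default_factory=dict)  # role -> list of addresses
    down: set = field(default_factory=set)       # addresses reported unresponsive

    def register(self, role, address):
        """Add a server for a role, e.g. a second scheduler on new hardware."""
        self.servers.setdefault(role, []).append(address)

    def report_down(self, address):
        """Mark a server as not responding."""
        self.down.add(address)

    def report_up(self, address):
        """Mark a server as responding again."""
        self.down.discard(address)

    def pick(self, role):
        """Return the first server for `role` that is not marked down, else None."""
        for address in self.servers.get(role, []):
            if address not in self.down:
                return address
        return None


if __name__ == "__main__":
    config = ConfigurationServer()
    config.register("scheduler", "192.0.2.10")
    config.register("scheduler", "192.0.2.11")
    config.report_down("192.0.2.10")
    # The node is directed to the alternative scheduler.
    print(config.pick("scheduler"))
```

Because nodes only ever ask the directory where to go, adding capacity in this sketch amounts to one more register call, which is the scalability property the list above describes.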

One of the bottlenecks experienced during times of high activity was the bandwidth. This and the problems with upload/download file integrity have prompted us to reduce the size and number of all files transferred to and from the client. The WUs consist of three files, which will be compressed into a single file. Similarly, the size of the results file will be reduced by pre-processing it on the client. This reduces the size of the file to a few kilobytes. This will make the life of modem users much easier.
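
As a rough illustration of bundling the three WU files into one compressed file, the sketch below uses a ZIP archive from the Python standard library. The announcement does not say which archive format D2OL will use, so the ZIP choice, the function names and the file names are assumptions made only for this example. The ZIP format stores a CRC checksum per file, which also gives a simple integrity check after download.

```python
import io
import zipfile


def pack_work_unit(files):
    """Compress a mapping of file name -> bytes into one ZIP archive (bytes)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def unpack_work_unit(blob):
    """Verify the stored checksums and return the files as name -> bytes."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        bad_name = archive.testzip()  # name of the first corrupt member, or None
        if bad_name is not None:
            raise ValueError(f"corrupt file in work unit: {bad_name}")
        return {name: archive.read(name) for name in archive.namelist()}


if __name__ == "__main__":
    # Hypothetical file names and contents, for illustration only.
    work_unit = {
        "target.dat": b"ATOM " * 2000,
        "ligands.dat": b"C1CCCCC1 " * 1000,
        "settings.cfg": b"steps=100\n",
    }
    blob = pack_work_unit(work_unit)
    print(len(blob), "bytes in one file instead of three")
    assert unpack_work_unit(blob) == work_unit
```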

With the modem user in mind, we have given the user more control over the client. The user can force uploads and downloads to minimize the time spent on-line.

I do not want to give everything away, so I will just say that the GUI will have some new features and increased functionality.

The most important new feature of Version 2.0, which addresses a long-known issue with communication between the users and the server-jockeys at the Rothberg Institute, is the status web-page. This page will show the status of the individual servers in real time. A second page, showing a more in-depth analysis of the results received, will be added in the near future. This system is under construction and will go live at the end of summer.

Stay tuned for regular updates and news on progress.

Best regards
Wolfgang Hinz

--------------------
Wolfgang Hinz - Senior Research Scientist Chemistry/Informatics
The Rothberg Institute for Childhood Diseases
530 Whitfield Street
Guilford CT, 06437
www.childhooddiseases.org