Tuning Docuware 6.* on a large network

We define a LARGE network as a BUSY Docuware network. It may have only 5 people adding content and 10 people using that content, but the people adding content are killing the system with 10K to 100K images each day. Even worse, they may scan in color, and the bigger the images, the greater the impact on the entire system. A LARGE network could also be a busy network with many users editing and changing many items all day long in cabinets marked for full-text indexing. This can keep the system busy all day and all night. IS may think they can just throw more memory and more CPU at it, but that normally does not solve the problem. This is a discussion of the different parts of the architectural foundation and how to make the most of them.

“Thumbnail service”

The Thumbnail service not only eats CPU time but can interrupt and tremendously slow down a system, directly affecting the user. Our advice: TURN IT OFF! On smaller systems, even on local systems, the change may not be dramatic, but on larger segmented systems the service can add 500% more time to a simple process like un-stapling documents. You can test this very easily: open a tray, drop a large number of files into it, and staple them together. Now, with the Thumbnail service turned OFF, un-staple them. It seems instantaneous. Staple them back together, turn the service back on, and un-staple them again. You will probably see the circle timer superimposed on each image outline for several seconds per image, not the nearly instant display you had before. I have seen systems that took 5 minutes to display the images with the service on; with it off, display is as quick as a user expects. So TURN IT OFF.
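
If you run the staple/un-staple test with a stopwatch, a small helper can turn the two readings into the "percent more time" figure used above. This is only a sketch: the function names and the sample readings are made up for illustration and are not measurements from any real system. Note that "500% more time" means six times the original duration, not five.

```python
def added_percent(off_seconds: float, on_seconds: float) -> float:
    """Percent of extra time the service adds, relative to the 'off' timing."""
    if off_seconds <= 0:
        raise ValueError("off_seconds must be positive")
    return (on_seconds - off_seconds) / off_seconds * 100.0


def with_overhead(base_seconds: float, percent_more: float) -> float:
    """Duration after adding percent_more percent to a baseline duration."""
    return base_seconds * (1.0 + percent_more / 100.0)


# Hypothetical stopwatch readings, not taken from a real Docuware system.
off_reading = 2.0    # seconds to un-staple with the Thumbnail service off
on_reading = 12.0    # seconds to un-staple with the Thumbnail service on

print(added_percent(off_reading, on_reading))  # 500.0
print(with_overhead(off_reading, 500.0))       # 12.0
```

Recording both timings before and after the change gives you a number to show IS, rather than an impression that things "feel faster".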

“LOCAL Full Text Services/Reading”

Although I have yet to see an easy way to tune the full-text engines, the most important thing is to be aware of them and to turn them off or move them when you can. For example, when users at the workstation start to complain that it is taking too long to process an import or a scan to the basket, there are 2 things you can do: remove the Thumbnail service and turn off the full-text/index on the LOCAL machine. The LOCAL machine is being used to OCR the records before they are sent to the server. Unfortunately, Docuware does not take into account the impact this has on the local machine at the time, and it can be devastating.

“Full Text Services on Server”

Full text as a service is more of an ON or OFF service. Although there are things you can do to adjust the way it runs when doing background work, the issue is that it often runs on every image that is displayed. That is because when an image is read (OCR) by the engine, the text is stored for indexing but the LOCATIONS of that text are NOT stored. In order to display the image and highlight the selected full text, Docuware must RE-OCR every image as it is displayed and then, based on that read, highlight the text accordingly. LOTS of horsepower is used to do this. You can manage this by moving the service to another machine. It helps more than you think.

“SQL”

There is a big difference between MySQL on the local server and MSSQL on a separate server. It would take a very long time to describe all of the benefits of moving the database to a separate server, so I will just say it this way: DO IT! Docuware agrees.

“STORAGE”

Storage would not seem to be something to move off the server; having what you need close by sounds like the right idea. But when you think about it, who uses storage more, the users or the services? The answer is simple: the services, even if your user base searches and retrieves every image every day.
If a cabinet has full text enabled, the FULLTEXT engine will have seen a document at least 3 times before you did! When you store the record, it is read for indexing, EVEN IF YOU DON’T USE IT, for point and shoot. After the image is stored, the full-text engine will read and re-index images as needed, sometimes overnight and sometimes during the day, whenever there are records to process. You want the full-text engine to have access to the images away from the users so that the users' experience is the best it can be. Even after all the reading, re-indexing and storing, the engine will read the image once again every time you display it. That is a great deal of repeat business, which makes it prudent to keep full text close to storage and away from the users' experience.

Docuware Recommends:

Many of the larger systems have SQL farms where multiple Microsoft SQL Servers are teamed up to provide fast service. You are not going to be able to move storage or full text onto those systems. Many large users, including my company, store on Network Attached Storage (NAS) devices and not on file servers. NAS and full text do not go together at all. You want an application server separate from the rest, with its own memory and the ability to respond very quickly. Solr has good documentation on how to make the system more responsive and better suited to reading documents.
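
To make the earlier point about who really uses storage concrete, the read pattern described above can be tallied in a few lines. This is a rough sketch under stated assumptions: one read when the record is stored, a number of background re-index passes (one by default, my assumption), and one re-OCR per display. The function names are hypothetical and this is not Docuware code.

```python
def fulltext_reads(displays: int, reindex_passes: int = 1) -> int:
    """Estimated number of times the full-text engine reads one document.

    Assumes one read at store time, `reindex_passes` background reads,
    and one re-OCR for every display, as described in the text.
    """
    if displays < 0 or reindex_passes < 0:
        raise ValueError("counts must not be negative")
    store_read = 1
    return store_read + reindex_passes + displays


def user_reads(displays: int) -> int:
    """The user only touches the document when it is displayed."""
    return displays


# By the time a user sees the document once, the engine has read it 3 times.
print(fulltext_reads(displays=1), user_reads(1))    # 3 1

# A document displayed 10 times: the engine still stays ahead of the users.
print(fulltext_reads(displays=10), user_reads(10))  # 12 10
```

Under these assumptions the engine's read count is always higher than the users', which is the reason for placing full text near storage rather than near the users.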

DATABASE: On the SQL Server/SQL Farm

Watching the network and the servers, you can see that this is a very good balance for the users and the services.

Building a Docuware Foundation:

More Databases = More Diversity:

Clients often have requirements revolving around some common concerns. Some of these concern mixing documents from different departments in the same database. Because Docuware out of the box puts all of the file cabinet tables in the same database, you have to know a little about SQL to get around this. One client has Local, State and Federal grants that it uses to manage its operation. Each of these may have rules requiring them not to intermingle data, images and resources with other processes.
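
A separation rule like this can be written down and checked before anything is built. The sketch below is a planning aid only, not Docuware configuration: the class, the function, and every database name and storage path in it are hypothetical. It lists where each funding source's cabinets would live and reports any database or storage location shared by more than one source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CabinetPlacement:
    funding_source: str  # e.g. "Federal"
    database: str        # hypothetical database name
    storage_path: str    # hypothetical storage location


def shared_resources(placements):
    """Return databases or storage locations used by more than one funding source."""
    owners = {}
    for p in placements:
        for resource in (("database", p.database), ("storage", p.storage_path)):
            owners.setdefault(resource, set()).add(p.funding_source)
    return {res: sorted(srcs) for res, srcs in owners.items() if len(srcs) > 1}


# Hypothetical layout: every funding source gets its own database and storage.
plan = [
    CabinetPlacement("Local", "dw_local", "/storage/local"),
    CabinetPlacement("State", "dw_state", "/storage/state"),
    CabinetPlacement("Federal", "dw_federal", "/storage/federal"),
]
print(shared_resources(plan))  # {}

# A cabinet that breaks the rule by putting State data in the Federal database.
bad_plan = plan + [CabinetPlacement("State", "dw_federal", "/storage/state")]
print(shared_resources(bad_plan))  # {('database', 'dw_federal'): ['Federal', 'State']}
```

An empty result means no two funding sources intermingle data or images in the planned layout.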
We set up a separate storage unit for the Federal documents, a separate database for the Federal file cabinets, and so on. Docuware can handle this, as you can define each of these and use them as you see fit. Now the system will be faster, more reliable and easier to manage and move.

Indexes:

File cabinet Design:

Storage:

From a management point of view, in a larger system, having different storage locations for different office groups can be easier to manage and back up. By separating the work of different departments into different storage locations, you can better track how much space each department is using, and with auditing techniques you can see which ones are the busiest. Some organizations bill the departments back for the space they use, making this technique a major part of their process.

Enterprise systems:

There is a good side and a bad side to this layout, and both concern only the workflow servers: they never fall back to each other. Workflow is a defined service and MUST run on the server it is designed for. You can only define 1 workflow server for a process. So having 2 is both good and bad: it forces you to load balance based on experience. You would now have 6 machines to power Docuware. You would think that is enough.

Perhaps you have very little to store but your user group is really BIG! You might flip this scenario. If the users need more access, then I would keep ALL of the Docuware services on the main server and move the web server.

The main point is not to look to software for all of the problems. Many are just architectural issues and can be solved in a different way.

Conclusion:

Although Docuware runs very well out of the box for many very small or even medium-sized companies, that may not be the best configuration for a larger, more diverse organization. Designs for every organization must consider the hardware available and the load on the network, as well as the obvious things like the number of images and users. The most obvious factors are the least of your worries; the foundation of the system is really what determines how everything works together. We have seen large companies with many users but little diversity run at very low CPU loads, and we have seen medium-sized organizations with 4 or 5 departments bury the computers in so much work that they barely meet the need. Throwing in more memory and more processors seems like the place to start, yet it may not work at all. Looking at the big picture, where the data is going, how it is getting there and what happens to it as it is processed, shows that this is more than a low-horsepower issue. It could be that the CAS is too slow to meet the demand, or the NAS is not diversified enough, or sometimes it is simply that full text and storage need to move off the main server to their own box. Although Docuware has recommendations for building and using their system, remember that it is built on other systems they have no control over. Storage, the network and even the Solr full-text engine can all be optimized by you for a very fast and efficient system. In the end, Docuware is very fast and capable of meeting the needs of a company of any size.