Smart Cache Manual
Chapter 9 - For Administrators

9.1 Smart Cache Performance

Smart Cache runs fast, and its performance is comparable with other (non-Java) proxy servers such as Apache or Squid. Proxy servers do not need much CPU power; real-world performance is limited by the filesystem, network speed and the quality of the TCP stack.

Smart Cache performs well on low-end hardware. Any low-end Pentium-class system with 32 MB of RAM can act as a proxy for a whole medium-size network without trouble. If you need a proxy for your home network, any 486 can do the job.

Pentium 166, 64 MB RAM, Debian Linux 3.0, reiserfs 3 filesystem mounted with notail,noatime - proxy server for 50 users.

486 66, 32 MB RAM, OS/2, HPFS filesystem with 2048KB diskcache, IDE hard drives - proxy for 25 users.

9.2 Proxy cache Benchmarks

Test method: The benchmark is a simple `give me the same page` test. I ran every test 3 times and recorded the highest number. Smart Cache configuration: -nolog -notrace -nofail -local. Other software used its defaults. The test HTML page was 38314 bytes long; you can get it from here. Numbers are in pages/min. The test execution time was 15 or 20 seconds. Testing software: webbench 1.2. Log file analysis shows that one active browsing user needs about 50-60 requests per minute; this is an average taken over 10 minutes of activity.

The results show the following:

  1. Only Java 1.1 runs faster with more memory available to the JVM.

  2. The -server switch gives no speed increase, although it is supposed to optimize the JVM for server applications.

  3. JDK 1.3 seems to crash sometimes when the -server switch is used.

  4. Linux is about 3 times faster than Windows XP.

  5. Linux has no problems with the benchmark load.

  6. Windows gets failed connections (about 0.1%), TCP/IP stack crashes (about 1 per 5 webbench runs) and BSODs (2 times during the benchmark). None of these problems shows up during normal Scache use.

Summary: The classic scache startup switches, "java -ms1m ", are best for everybody with JDK > 1.1.

9.2.1 Pentium 166/32MB

Hardware used: Pentium 166/32MB RAM, IDE hard drive (hdparm -t 7.8MB), Debian Linux 3.0, kernel 2.4.19, reiserfs 3.6.25 filesystem.

Software used: Sun JDK 1.1.8v1, tya JIT 1.7v2, Apache 1.3.24, GCJ-3.2, Squid 2.4.6, Smart Cache 0.77pre.

     Server                    pages/min
     smartcache-gcj            6512
     Squid                     4528
     Apache Cache              3980
     Apache from localhost     3808

     Virtual machine: jdk 1.1.8v1 with green threads.
     Switch     no JIT     tya JIT
     default    2856       4500
     -ms1m      2792       4064
     -ms2m      3092       5348
     -ms3m      3768       5432
     -ms4m      3932       5468
     -ms5m      3976       5504
     -ms6m      4048       5460
     -ms7m      4068       5592

These tests were run before the -local switch fix. The patches add 4% to the Smart Cache results and should be used if you want to get comparable results.

9.2.2 Pentium IV/256MB

Hardware used: Pentium IV/256MB RAM, 256KB cache. IDE hard drive (hdparm -t 40.76MB). Windows XP Home, Knoppix Linux 3.1 with kernel 2.4.20-xfs, NTFS filesystem.

Software used: Sun JDK 1.4.1-rc (Windows), Blackdown JDK 1.3.1-02b-FCS, Apache 1.3.24/Win32, Smart Cache 0.86b4, webbench 1.2.

     Windows results:
     Apache serving local pages             9933
     Apache serving pages from mod_proxy    14886

     Switch     without -server    with -server
     -ms1m      11850              11193
     -ms2m      11649              11214
     -ms3m      11766              10455
     -ms4m      11619              11379
     -ms5m      11733              11268
     -ms7m      11769              11160
     -ms16m     11906              11253
     -ms32m     11760              11124
     -ms64m     11673              10983

     Linux results:
     Switch     without -server    with -server
     -ms1m      32118
     -ms2m      32160
     -ms3m      31326
     -ms8m      32049              32868
     -ms16m     32448              32286
     -ms32m     32472              33888
     -ms64m     32913              33996

9.2.3 Case study #1 - 486/33 OS/2

On my old 486/33DX system with 20 MB RAM running IBM JDK 1.1.4 under OS/2 (JIT off), Smart Cache serves 350 pages/min. With JIT it is about 25% faster under OS/2, but it sometimes crashes due to a JIT bug. The OS/2 port of the Apache 1.1 server serves only about 250.

So a 486/33 with Linux can serve 15 users without any problems, provided there is a sufficient amount of installed RAM (20 MB) and the machine does not run any CPU-intensive tasks at normal or high priority. I used this configuration for 2 years on OS/2, which is even slower at filesystem operations than Linux.

9.2.4 Case study #2 - 486/100 Linux

On a 486 100 MHz Linux system with JDK 1.1.7, tya 1.1 JIT and green threads, Smart Cache serves 910 pages/min. Do not use native threads under Java 1.1 on Linux; they are buggy, slow and memory hungry.

Linux can use different filesystems. Do not use VFAT unless you need to share your cache with Windows running on the same machine. In my benchmarks I have found that reiserfs (especially with notail,noatime) is much faster than ext2. Reiserfs can pack small files together into one disk block, which saves a lot of disk space on a large cache but makes it slower; the notail option turns this packing off.

9.3 Administration of large cache

Large caches need some special consideration. A large cache is anything over 2 GB.

9.3.1 Physical storage of data

It is best to store cached data on a dedicated physical drive. You do not need to back it up; if the drive fails, simply throw it away and plug in a new one. You can reuse some old hard drives for this.

If you are not using a dedicated physical drive, then use a dedicated partition (volume, stripe) of fixed size. This saves you a lot of trouble when your server is short of disk space.

9.3.2 Choosing a filesystem

Important: You _MUST_ use a journaling filesystem if you do not want to wait for hours after a server crash.

Linux: I recommend the reiser3 filesystem mounted with the notail,noatime options. It has a 4 kB block size and performs very well. I cannot recommend the JFS filesystem.

Windows: I recommend NTFS over FAT32. It has higher performance and seems to be more stable than FAT32.

9.3.3 Configuring swaplevels

You must change the default values of swap_level1_dirs and swap_level2_dirs in scache.cnf. The example file sample/scache.cnf contains a table which lists minimal recommended values based on cache size.

9.3.4 Configuring max connections

Most browsers use 4 network connections when talking to a proxy server. Suppose your network has 100 users. You can assume that about 5% of your users are browsing the web at any one time. Set max_connections to 35; if your outgoing network connection can handle more, you can increase it. While you are editing the configuration, also change loglevel to 3 or 2.
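The arithmetic behind this estimate can be sketched as follows. This is only an illustration: the function name is made up for this example and is not part of Smart Cache. With the numbers from the paragraph above, the baseline comes out lower than the suggested value of 35; the text does not say where the extra margin comes from, so treating it as headroom for bursts is an assumption.

```python
def baseline_connections(users: int, active_percent: int, conns_per_browser: int) -> int:
    """Connections needed if active_percent of the users browse at the same time."""
    # Integer ceiling division, so a fraction of a user counts as one user.
    active_users = -(-users * active_percent // 100)
    return active_users * conns_per_browser


# Numbers from the text: 100 users, about 5% active, 4 connections per browser.
baseline = baseline_connections(users=100, active_percent=5, conns_per_browser=4)
print(baseline)  # 20
```

Whatever value you derive this way is a lower bound for max_connections; raise it if your outgoing line can handle more.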

9.3.5 Fine tuning garbage collection

Smart Cache's garbage collection on large caches can take some time. The art of administering a large SC is to tweak the gc.cnf values so that the daily files are cleared by a single-scan garbage collection. How can you do it?

In one cache scan, Smart Cache GC deletes at most gcarraysize objects plus the number of too-old objects. You must make gcarraysize large enough to hold your daily traffic minus the files which have been deleted during the scan. The average file size is in most cases about 9-13.5 kB; if you have 100 MB of daily traffic, set gcarraysize to 16384 (the minimum recommended value for a 10 GB cache). The GC runs best when gcarraysize is a power of two, because it uses binary search. A bigger gcarraysize costs more CPU and memory during GC (not a problem today). If you make gcarraysize too big and the OS needs to start swapping, GC will run very slowly or run out of memory. One slot in gcarray takes about 1 KB of memory.
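The sizing rule above can be written down as a small calculation. This is a rough sketch, not part of Smart Cache: the function names are invented for this example, it uses the low end of the average file size (9 kB) to stay on the safe side, and it ignores the files deleted during the scan, so it slightly overestimates.

```python
import math


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def suggest_gcarraysize(daily_traffic_mb: float, avg_file_kb: float = 9.0) -> int:
    """Estimate gcarraysize from daily traffic, rounded up to a power of two."""
    files_per_day = math.ceil(daily_traffic_mb * 1024 / avg_file_kb)
    return next_power_of_two(files_per_day)


def gcarray_memory_mb(gcarraysize: int) -> float:
    """Approximate memory used during GC, at about 1 KB per slot."""
    return gcarraysize / 1024


size = suggest_gcarraysize(100)   # 100 MB of daily traffic
print(size)                       # 16384
print(gcarray_memory_mb(size))    # 16.0
```

With 150 MB of daily traffic the same calculation gives 32768 slots, or about 32 MB of memory during GC.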

The gcarray is never big enough on multi-gigabyte caches, so use settings like the following to force fast cleaning of non-checkable and other junk pages during the cache scan; this will save you valuable gcarray slots. It is also a very good idea to set reference_age to some smaller value, for example 120d.

     negative_cached_penalty =0.5
     expired_penalty =0.5
     expired_but_checkable_penalty +0
     redirect_penalty =0.5
     not_checkable_penalty =3
     reference_age 4months

9.3.6 Example large cache configuration

30 GB cache, gcarraysize 32768 (if you have 256 MB of RAM or more, double it; you can use 64k even with 128 MB of RAM, but I have lowered it to be 100% sure that it will not crash when someone runs memory-hungry software such as Mozilla or Open Office there), swapdirs 32x32, high swap mark 95, low swap mark 85, 150 MB daily traffic. GC runs once per week and takes about 1 hour to complete. Hardware used: Linux 2.4.20 (needed for speeding up reiserfs IDE writing), reiserfs filesystem (notail,noatime of course), dedicated ATA/100 IDE drive for caching, 128 MB of RAM, 700 MHz AMD CPU.

9.4 Making native executable

Smart Cache also runs when it is compiled with GCJ to a native executable. I have tested GCJ 3.2 and there are no problems with it. The native executable runs faster (at least on Linux), but uses more memory. It is quite possible that you will not be able to compile it due to bugs in GCJ.

You can build the Debian GCJ-compiled package scache-native directly from the sources.

9.5 Smart Cache vs Squid

Squid and Smart Cache are not competing products. They use different designs; the Apache web server cache also uses the SC design. This section will show some differences. In short, Squid needs huge amounts of RAM.

9.6 Smart Cache memory usage

Smart Cache has been optimized for minimal RAM and CPU use. In normal use (by a single browsing user), it requires only about 400 KB of RAM from the JVM. To get the total size of required memory, you must add the shared dynamic (DLL) code loaded by the Java runtime itself, which depends on the Java version, the Java runtime options and the operating system used.

9.6.1 Linux hints

On Linux with Blackdown JDK 1.1.7, disable the Motif Java interface (export NS_JAVA=true). You will then need only about 2.2 MB of physical RAM (green threads without JIT) or 2.7 MB of physical RAM with tya JIT. When Smart Cache is in heavy use by multiple users, add 350-500 KB.

9.7 Web forwarding with Smart Cache

This feature is currently disabled for security reasons.

This is the setup for a web forwarder from www.firma.cz to www-orig.firma.cz. Smart Cache runs on www.firma.cz and forwards requests to www-orig.firma.cz. Some people use the term web accelerator instead of forwarder.

Web forwarders are useful, for example, if you need to maintain your WWW site locally but have a slow line and many users. Put the forwarder on the ISP backbone; people will use the forwarder, and the forwarder will send back cached data from your server. A forwarder is also good when a website uses dynamically generated web pages. With many users, the same pages would otherwise be generated again for every request, which uses valuable CPU resources.

Example of scache.cnf:

     BindAddress *
     port 80
     Fail_file fail.cnf
     Pass_file pass.cnf
     default_forward_for www.firma.cz
     rewrite_file rewrite.cnf
     access_log http://www.firma.cz/*  firma.log
     NoCaching */cgi-bin/*
     default_refresh_pattern 15 60 0.20 5000
     pragma_no_cache 0

Example of fail.cnf:


Example of pass.cnf:


Example of rewrite.cnf:

     http://www.firma.cz/* http://www-orig.firma.cz/ 1

9.8 Using Smart Cache as transparent proxy

Smart Cache can also be used in transparent-proxy mode. Unlike Squid, Smart Cache does all security checks, so you will not open a new security hole. [5] You do not have to set up anything in scache.cnf, but you need a TCP/IP stack or router hardware/software which allows transparent proxying. Linux or Cisco operating systems can do that.

TCP/IP must be configured to redirect all traffic with destination port 80 to your Smart Cache's input port (the same one used for normal proxy requests). This configuration is operating system dependent; ask your network administrator to do it.

On Linux, this is done with the ipfwadm/ipchains/iptables programs. Consult the NAT-HOWTO, section 6.2, for more information.

     ## Send incoming port-80 web traffic from interface eth1 
     #    to our SmartCache (transparent) proxy.
     iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 \
                 -j REDIRECT --to-port 8080

9.9 Using cache directory swaplevels

SC by default uses 4x4 swap directory levels. If you look into them and see that there are many (about 100) subdirectories with server names, it may be wise to increase swap_level1/2_dirs and then redistribute the directories to the new levels using the -rebalance command line option.

For classic filesystems which do not use trees for indexing directories (ext2, FAT), there should not be more than 100 subdirectories. Modern filesystems such as HPFS, NTFS, reiserfs and xfs can handle more without any performance problems. See scache.cnf for the cache size vs swaplevels table.
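As a rough way to reason about the 100-subdirectory guideline, the average number of server subdirectories per lowest-level swap directory can be estimated. This is a sketch under a simplifying assumption that cached servers are spread evenly over all swap directories; the function names and the host counts are invented for this example, and the table in scache.cnf remains the recommended source for real values.

```python
def avg_servers_per_swapdir(cached_servers: int, level1_dirs: int, level2_dirs: int) -> float:
    """Average number of server subdirectories in one lowest-level swap directory."""
    return cached_servers / (level1_dirs * level2_dirs)


def should_increase_levels(cached_servers: int, level1_dirs: int, level2_dirs: int,
                           limit: int = 100) -> bool:
    """True if the average exceeds the guideline of about 100 subdirectories."""
    return avg_servers_per_swapdir(cached_servers, level1_dirs, level2_dirs) > limit


# Hypothetical cache holding pages from 4000 different servers.
print(avg_servers_per_swapdir(4000, 4, 4))   # 250.0
print(should_increase_levels(4000, 4, 4))    # True
print(should_increase_levels(4000, 8, 8))    # False
```

After changing the levels, the existing directories still have to be redistributed with the -rebalance option.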

9.10 Setting the bindaddress

By default, BindAddress is set up so that no other computers can connect to Smart Cache. Many people do not need to access Smart Cache from other computers, and this default also closes access from the Internet to your computer or LAN via Smart Cache, which is a Good Thing. This is very good for security, but not very useful if you have a LAN.

If you have an Intranet and want to allow other Intranet computers to connect, enter your Intranet IP address here.

Magic '*' allows any computer (even from the Internet) to connect and use your proxy. If you use it, blocking some URLs in your private network will be wise (see URL blocking, Section 5.6). There are no known security holes.

For fine-tuning access to your proxy server see Proxy access control, Section 9.11.

9.11 Proxy access control

Proxy access may be limited by using the "allow" configuration statement with a host IP address as a parameter, which grants access only to the specified hosts. Hosts that are not allowed to access the proxy server will receive a "403 Cache access denied" response. If no "allow" statements are found, then any host is allowed to access the proxy.

An allow statement can optionally have a netmask or a number of bits appended.

     # allow proxy access only from computer it is running at and from addresses
     # and
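The matching logic described above (single host, netmask, or number of bits) can be illustrated with a short sketch. This is not Smart Cache's own code: it uses Python's standard ipaddress module, the addresses are hypothetical, and writing the netmask or bit count after a "/" is an assumption made for this example, since the exact allow syntax is not shown here.

```python
import ipaddress
from typing import Sequence


def is_allowed(client_ip: str, allow_entries: Sequence[str]) -> bool:
    """Return True if client_ip matches any allow entry.

    Entries may be a single host, host/netmask or host/bits.
    With no entries at all, every host is allowed.
    """
    if not allow_entries:
        return True
    addr = ipaddress.ip_address(client_ip)
    # strict=False accepts entries whose host bits are set, e.g. 192.168.1.7/24.
    return any(addr in ipaddress.ip_network(entry, strict=False)
               for entry in allow_entries)


# Hypothetical access list: the local machine and two private networks.
allow = ["127.0.0.1", "192.168.1.0/24", "10.0.0.0/255.0.0.0"]

print(is_allowed("192.168.1.42", allow))  # True
print(is_allowed("10.20.30.40", allow))   # True
print(is_allowed("172.16.0.1", allow))    # False
print(is_allowed("172.16.0.1", []))       # True
```

A client for which this check fails is the one that would receive the "403 Cache access denied" response.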


Radim Kolar hsn_nospam_at.sendmail.dot.cz