Squid3 Proxy on Ubuntu

Using a personal proxy server can be helpful for a variety of reasons, such as:

  • Performance – network speed and bandwidth
  • Security – filtering and monitoring
  • Debugging – to trace activity

Here are some simple steps to get you started,  obviously you will need to further “harden” security to make it production ready!


sudo apt-get install squid3


cd /etc/squid3/
sudo mv squid.conf squid.orig
sudo vi squid.conf

NOTE: the following configuration works, but will likely need to be adapted for your specific usage.


http_port 3128
visible_hostname proxy.EXAMPLE.com
auth_param digest program /usr/lib/squid3/digest_file_auth -c /etc/squid3/passwords
#auth_param digest program /usr/lib/squid3/digest_pw_auth -c /etc/squid3/passwords
auth_param digest realm proxy
auth_param basic credentialsttl 4 hours
acl authenticated proxy_auth REQUIRED
acl localnet src 10.0.0.0/8 # RFC 1918 possible internal network
acl localnet src 172.16.0.0/12 # RFC 1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC 1918 possible internal network
acl localnet src fc00::/7 # RFC 4193 local private network range
acl localnet src fe80::/10 # RFC 4291 link-local (directly plugged) machines
#acl SSL_ports port 443
#http_access deny to_localhost
#http_access deny CONNECT !SSL_ports
http_access allow localnet
http_access allow localhost
http_access allow authenticated
via on
forwarded_for transparent

Create the users and passwords:

sudo apt-get install apache2-utils (required for htdigest)
sudo htdigest -c /etc/squid3/passwords proxy user1
sudo htdigest /etc/squid3/passwords proxy user2

Open up firewall port (if enabled):

sudo ufw allow 3128

Restart the server and tail the logs:

sudo service squid3 restart
sudo tail -f /var/log/squid3/access.log

OTHER FILE LOCATIONS:

/var/spool/squid3
/etc/squid3

MONITORING with Splunk…

sudo /opt/splunkforwarder/bin/splunk add monitor /var/log/squid3/access.log -index main -sourcetype Squid3
sudo /opt/splunkforwarder/bin/splunk add monitor /var/log/squid3/cache.log -index main -sourcetype Squid3

REFERENCES:

HTML5 Link Prefetching

Link prefetching is used to identify a resource that might be required by the next navigation, and that the user agent SHOULD fetch, such that the user agent can deliver a faster response once the resource is requested in the future.


<link rel="prefetch" href="http://www.example.com/images/sprite.png" />

<link rel="prefetch" href="/images/sprite.png" />

Supported in:

  • MSIE 11+/Edge
  • Firefox 3.5+ (for HTTPS)
  • Chrome
  • Opera

REFERENCES:

Java User-Agent detector and caching

It’s often important for a server side application to understand the client platform. There are two common methods used for this.

  1. On the client itself, “capabilities” can be tested.
  2. Unfortunately, the server cannot easily test these, and as such must usually rely upon the HTTP Header information, notably “User-Agent”.

Example User-agent might typically look like this for a common desktop browser, developers can usually determine the platform without a lot of work.

"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3)"

Determining robots and mobile platforms, unfortunately is a lot more difficult due to the variations. Libraries as those described below simplify this work as standard Java Objects expose the attributes that are commonly expected.

With Maven, the dependencies are all resolved with the following POM addition:

<dependency>
<groupid>net.sf.uadetector</groupid>
<artifactid>uadetector-resources</artifactid>
<version>2014.10</version>
</dependency>


/* Get an UserAgentStringParser and analyze the requesting client */
final UserAgentStringParser parser = UADetectorServiceFactory.getResourceModuleParser();
final ReadableUserAgent agent = parser.parse(request.getHeader("User-Agent"));

out.append("You're a '");
out.append(agent.getName());
out.append("' on '");
out.append(agent.getOperatingSystem().getName());
out.append("'.");

As indicated on the website documentation, running this query for each request uses valuable server resources, it’s best to cache the responses to minimize the impact!

http://uadetector.sourceforge.net/usage.html#improve_performance

NOTE: the website caching example is hard to copy-paste, here’s a cleaner copy.


/*
* COPYRIGHT.
*/
package com.example.cache;

import java.util.concurrent.TimeUnit;
import net.sf.uadetector.ReadableUserAgent;
import net.sf.uadetector.UserAgentStringParser;
import net.sf.uadetector.service.UADetectorServiceFactory;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

/**
* Caching User Agent parser
* @see http://uadetector.sourceforge.net/usage.html#improve_performance
* @author Scott Fredrickson [skotfred]
* @since 2015jan28
* @version 2015jan28
*/
public final class CachedUserAgentStringParser implements UserAgentStringParser {

private final UserAgentStringParser parser = UADetectorServiceFactory.getCachingAndUpdatingParser();
private static final int CACHE_MAX_SIZE = 100;
private static final int CACHE_MAX_HOURS = 2;
/**
* Limited to 100 elements for 2 hours!
*/
private final Cache<String , ReadableUserAgent> cache = CacheBuilder.newBuilder().maximumSize(CACHE_MAX_SIZE).expireAfterWrite(CACHE_MAX_HOURS, TimeUnit.HOURS).build();

/**
* @return {@code String}
*/
@Override
public String getDataVersion() {
return parser.getDataVersion();
}
/**
* @param userAgentString {@code String}
* @return {@link ReadableUserAgent}
*/
@Override
public ReadableUserAgent parse(final String userAgentString) {
ReadableUserAgent result = cache.getIfPresent(userAgentString);
if (result == null) {
result = parser.parse(userAgentString);
cache.put(userAgentString, result);
}
return result;
}
/**
*
*/
@Override
public void shutdown() {
parser.shutdown();
}
}

REFERENCES:

Cloudflare CDN (Content Delivery Network)

Best practices for web applications often call for the use of a CDN. Those of you that have worked with YSlow! are likely very accustomed to seeing warnings for this reason. I’ve found that CloudFlare is very easy to setup, and for basic services costs absolutely nothing. In addition to the obvious performance advantages of using a CDN to offload much of your network traffic, it also has the advantage of improved security.

CDN’s work by caching a copy of your static content at several locations around the world, making it closer and faster for your users.

Implementation takes only minutes as it requires that you:

  1. create a (free) account,
  2. retrieve your existing DNS values from your current provider,
  3. determine direct vs. CDN “cloud” routing for each subdomain,
  4. change your DNS records to point to the CloudFlare DNS servers

Some additional advantages I’ve seen since implementing:

  • Site remains available in limited capability to users during server outages or upgrades.
  • Simplified network configuration as all requests can be sent outside of the LAN for users local to the servers
  • IPv6 dual-stack support

REFERENCES:

window.location.reload(true);

While working on legacy apps, you might occasionally find that a developer has written a function to reload the existing page. I’ve found that in many cases, the optional javascript argument is neglected in this case and thus the ‘cached’ copy of the page is presented, often showing stale or invalid data. The solution here is simple, always specify ‘true’ to force the page to be requested from the server.


<script type="text/javascript">
window.location.reload(true);//NOTE: 'true' forces NON-cached copy to be returned
</script>

REFERENCES:

DNS Prefetching

DNS is much like a phone book for the internet. For each domain name (or subdomain like ‘www’), there is an IP address that resembles a phone number. Getting the matching number for each domain can take some time and make your site appear slow, particularly on mobile connections. Fortunately, you can pre-request this information and speed up your site in most cases.

To enable a domains DNS lookup to be performed in advance of the request, you can add a single line to the <head> section of your page.

<link rel="dns-prefetch" href="//giantgeek.com" />

If you want to explicitly turn on (or off) this behavior, you can add one of the following, or their HTTP Header equivalents:

<meta http-equiv="x-dns-prefetch-control" content="on" />
<meta http-equiv="x-dns-prefetch-control" content="off" />

This is supported in all modern browsers:

  • Firefox 3.5+
  • Safari 5.0+
  • MSIE 9.0+

If should be noted that a similar method can be used to prefetch as page, but I will save that for a different article:
<link rel="prefetch" href="http://www.skotfred.com/" />

REFERENCES:

HTTP ETag Header

These are useful for some advanced caching behavior, but there are cases where you might find them unnecessary for static files (in particular). Most network analysis tools will call attention to this header value, and while it seems like a trivial amount of bandwidth to send from the server to the client, the real reason for the negative score is more likely related to the behaviors that it causes in the client.

It should be noted that the default value used for the ETag is based upon the ‘inode’ of the file, as such it’s IS problematic in clustered server environments. I’ve shown the correction for this below.

Adding the following to your Apache http.conf file is a start:


# Change ETag to remove the iNode (for multi-server environments)
FileETag MTime Size

#Remove ETag from all static content, this could be done globally without the FilesMatch, but we want better control.
<FilesMatch "\.(html|htm|js|css|gif|jpe?g|png|pdf|txt|zip|7z|gz|jar|war|tar|ear|java|pac)$">
<IfModule header_module>
Header unset ETag
</IfModule>
</FilesMatch>

REFERENCES:

Cheers.

HTML5 offline appcache manifest

In my testing, you don’t need to fully embrace HTML5 markup to take advantage of the “offline” functionality, you simply need to add the attribute and related files to your existing website/page. Any modern browser that supports HTML5 should automatically recognize the offline content and use it when appropriate, unfortunately no version of MSIE yet supports this.

<html manifest="/offline.appcache">

In that file, you must then specify the offline behavior, something like this is a good start:

CACHE MANIFEST
#This is to provide minimal HTML5 offline capabilities
#MIME mapping must be 'appcache=text/cache-manifest'
#Reference to this file is per page, you can have different ones in an app.
#Common image files and css may be 'cached'
CACHE:
/index.html
/css/offline.css
NETWORK:
*
FALLBACK:
/ /offline.html

On the server side, you’ll have to serve up that file with the appropriate MIME type (text/cache-manifest, for ApacheHTTPD you simply need to add one line to httpd.conf:

AddType text/cache-manifest .appcache

REFERENCES:

Linux/Windows file cleanup

If you make heavy (or even typical) use of your computer, you’ll often notice that it just doesn’t seem as fast as it once was. For a slight increase in performance, disk space and to generally remove some of the ‘temporary’ files/cruft that are routinely written to disk you have a few options.

Here are a few of my current favorites for doing ‘Spring Cleaning’ on my computers… BleachBit and CCleaner

BleachBit is available on all major platforms (Windows, OS/X, Linux).

Flash Cookies / Website Storage

If you’ve been online at all in the last decade, you’ve heard of the “dangers” of HTTP Cookies. More nefarious and harder to remove are Flash Cookies as they are handled by a plugin/extension/addon to the browser and exist outside of the normal security settings.

To see or delete Flash data, you’ve got to visit the following URL:
http://www.macromedia.com/support/documentation/en/flashplayer/help/settings_manager07.html

You will probably be suprised to see many of the sites listed, as Flash is often being used to present you with ads in addition to the interactive elements that you might expect.

REFERENCES: