Squid3 Proxy on Ubuntu

Using a personal proxy server can be helpful for a variety of reasons, such as:

  • Performance – network speed and bandwidth
  • Security – filtering and monitoring
  • Debugging – to trace activity

Here are some simple steps to get you started,  obviously you will need to further “harden” security to make it production ready!


sudo apt-get install squid3


cd /etc/squid3/
sudo mv squid.conf squid.orig
sudo vi squid.conf

NOTE: the following configuration works, but will likely need to be adapted for your specific usage.


http_port 3128
visible_hostname proxy.EXAMPLE.com
auth_param digest program /usr/lib/squid3/digest_file_auth -c /etc/squid3/passwords
#auth_param digest program /usr/lib/squid3/digest_pw_auth -c /etc/squid3/passwords
auth_param digest realm proxy
auth_param basic credentialsttl 4 hours
acl authenticated proxy_auth REQUIRED
acl localnet src 10.0.0.0/8 # RFC 1918 possible internal network
acl localnet src 172.16.0.0/12 # RFC 1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC 1918 possible internal network
acl localnet src fc00::/7 # RFC 4193 local private network range
acl localnet src fe80::/10 # RFC 4291 link-local (directly plugged) machines
#acl SSL_ports port 443
#http_access deny to_localhost
#http_access deny CONNECT !SSL_ports
http_access allow localnet
http_access allow localhost
http_access allow authenticated
via on
forwarded_for transparent

Create the users and passwords:

sudo apt-get install apache2-utils (required for htdigest)
sudo htdigest -c /etc/squid3/passwords proxy user1
sudo htdigest /etc/squid3/passwords proxy user2

Open up firewall port (if enabled):

sudo ufw allow 3128

Restart the server and tail the logs:

sudo service squid3 restart
sudo tail -f /var/log/squid3/access.log

OTHER FILE LOCATIONS:

/var/spool/squid3
/etc/squid3

MONITORING with Splunk…

sudo /opt/splunkforwarder/bin/splunk add monitor /var/log/squid3/access.log -index main -sourcetype Squid3
sudo /opt/splunkforwarder/bin/splunk add monitor /var/log/squid3/cache.log -index main -sourcetype Squid3

REFERENCES:

HTML5 Link Prefetching

Link prefetching is used to identify a resource that might be required by the next navigation, and that the user agent SHOULD fetch, such that the user agent can deliver a faster response once the resource is requested in the future.


<link rel="prefetch" href="http://www.example.com/images/sprite.png" />

<link rel="prefetch" href="/images/sprite.png" />

Supported in:

  • MSIE 11+/Edge
  • Firefox 3.5+ (for HTTPS)
  • Chrome
  • Opera

REFERENCES:

HTML5 preconnect

In addition to dns-prefetch, you can take browser performance one step further by actually creating a new connection to a resource.

By initiating an early connection, which includes the DNS lookup, TCP handshake, and optional TLS negotiation, you allow the user agent to mask the high latency costs of establishing a connection.

Supported in:

  • Firefox 39+ (Firefox 41 for crossorigin)
  • Chrome 46+
  • Opera


<link rel="preconnect" href="//example.com" />
<link rel="preconnect" href="//cdn.example.com" crossorigin />

REFERENCES:

HTML5 DNS prefetch

I often get into some fringe areas of micro-optimizations of website performance, DNS prefetching is another one of those topics.

To understand how this can help, you must first understand the underlying concepts that are used within the communications used to build your web page.

The first of these is a “DNS Lookup”, where the domain name (www.example.com) is converted into a numerical address, the IP address of the server that contains the file(s).

In many websites, content is included from other domains for performance or security purposes.

When the domain names are known in advance, this approach can save time on the connection as the lookup can fetched in advance, before it is required on the page to retrieve assets.

This can be particularly useful for users with slow connections, such as those on mobile browsers.


<link rel="dns-prefetch" href="//www.example.com" />

Supported in:

  • MSIE9+ (MSIE10+ as dns-prefetch)/Edge
  • Firefox
  • Chrome
  • Safari
  • Opera

REFERENCES:

Redirect within a javascript file

There often comes a time when you are working on a large project and find a need to refactor javascript resources. Unfortunately, if those assets are accessed by 3rd parties or other code you cannot easily update, you might find yourself stuck.

If you have access to the Tier1 (HTTP server such as ApacheHTTPd) you can often do this within the httpd.conf, or an .htaccess file update. If not, you can always do a simple function within the old javascript file itself, such as the one below.

Put this in the old javascript file location, it is in a closure to prevent the variables from “leaking” into the global namespace.


/* MOVED */
(function(){
"use strict";
var u='/js/newfile.js';
var t=document.createElement('script');t.type='text/javascript';t.src=u;
var s=document.getElementsByTagName('script')[0];s.parentNode.insertBefore(t,s);
})();

Brotli Compression

If you look at HTTP Headers as often as I do, you’ve likely noticed something different in Firefox 44 and Chrome 49. In addition to the usual ‘gzip’, ‘deflate’ and ‘sdhc’ , a new value ‘br’ has started to appear for HTTPS connections.

Request:

Accept-Encoding:br

Response:

Content-Encoding:br

Compared to gzip, Brotli claims to have significantly better (26% smaller) compression density woth comparable decompression speed.

The smaller compressed size allows for better space utilization and faster page loads. We hope that this format will be supported by major browsers in the near future, as the smaller compressed size would give additional benefits to mobile users, such as lower data transfer fees and reduced battery use.

Advantages:

  • Brotli outperforms gzip for typical web assets (e.g. css, html, js) by 17–25 %.
  • Brotli -11 density compared to gzip -9:
  • html (multi-language corpus): 25 % savings
  • js (alexa top 10k): 17 % savings
  • minified js (alexa top 10k): 17 % savings
  • css (alexa top 10k): 20 % savings


NOTE: Brotli is not currently supported Apache HTTPd server (as of 2016feb10), but will likely be added in an upcoming release.

http://mail-archives.apache.org/mod_mbox/httpd-users/201601.mbox/%[email protected]%3E

Until there is native support, you can pre-compress files by following instructions here…
https://lyncd.com/2015/11/brotli-support-apache/

REFERENCES:

Google and Facebook bypassing P3P User Privacy Settings

I wrote about P3P a very long time ago, and have implemented it on several websites. Some history, the W3C crafted the P3P policy.
Microsoft introduced P3P support in IE6 (in 2001) and it remains implemented in all current versions of the browser. The primary intended use is to block 3rd party cookies within the browser on behalf of the user.

Interesting enough, Microsoft has had been a bit of a struggle with Google and Facebook, which send the following HTTP response headers.

Google’s Response:

P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."

Facebook’s response:

P3P: CP="Facebook does not have a P3P policy. Learn why here: http://fb.me/p3p"

REFERENCES:

Install Google mod_pagespeed for Apache HTTP

Website network performance is usually a very complicated process. Over the years, I’ve outlined many development techniques that can be used toward this goal. I’d heard about mod_pagespeed for some time, but never had the opportunity to experiment with it until recently. My first impression is that it is a VERY EASY means to gain performance improvements without reworking your existing website to implement techniques for establishing far-future expires, cache-busting, minification and static file merging.

Out of the box, most common techniques are automatically applied to your assets and a local server cache is created to utilize them.

Default installation is trivial:

  1. Download the latest version for your server architecture:

    wget https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-stable_current_amd64.deb

    OR

    wget https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-stable_current_i386.deb

  2. sudo dpkg -i mod-pagespeed-*.deb

  3. sudo apt-get -f install
    (if required)

  4. sudo service apache2 restart

NOTE: Using tools like Firebug will enable you to see an HTTP Header indicating the version being used for your requests.

If you need to modify configuration from the default:

sudo vi /etc/apache2/mods-available/pagespeed.conf

For VirtualDomains, you can selectively enable and disable PageSpeed and many of it’s settings in your appropriate configuration files with:

<IfModule pagespeed_module>
ModPagespeed on
</IfModule>

NOTE: Appending ?ModPagespeed=off to your URL will disable functions for that request.

REFERENCES:

Java User-Agent detector and caching

It’s often important for a server side application to understand the client platform. There are two common methods used for this.

  1. On the client itself, “capabilities” can be tested.
  2. Unfortunately, the server cannot easily test these, and as such must usually rely upon the HTTP Header information, notably “User-Agent”.

Example User-agent might typically look like this for a common desktop browser, developers can usually determine the platform without a lot of work.

"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3)"

Determining robots and mobile platforms, unfortunately is a lot more difficult due to the variations. Libraries as those described below simplify this work as standard Java Objects expose the attributes that are commonly expected.

With Maven, the dependencies are all resolved with the following POM addition:

<dependency>
<groupid>net.sf.uadetector</groupid>
<artifactid>uadetector-resources</artifactid>
<version>2014.10</version>
</dependency>


/* Get an UserAgentStringParser and analyze the requesting client */
final UserAgentStringParser parser = UADetectorServiceFactory.getResourceModuleParser();
final ReadableUserAgent agent = parser.parse(request.getHeader("User-Agent"));

out.append("You're a '");
out.append(agent.getName());
out.append("' on '");
out.append(agent.getOperatingSystem().getName());
out.append("'.");

As indicated on the website documentation, running this query for each request uses valuable server resources, it’s best to cache the responses to minimize the impact!

http://uadetector.sourceforge.net/usage.html#improve_performance

NOTE: the website caching example is hard to copy-paste, here’s a cleaner copy.


/*
* COPYRIGHT.
*/
package com.example.cache;

import java.util.concurrent.TimeUnit;
import net.sf.uadetector.ReadableUserAgent;
import net.sf.uadetector.UserAgentStringParser;
import net.sf.uadetector.service.UADetectorServiceFactory;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

/**
* Caching User Agent parser
* @see http://uadetector.sourceforge.net/usage.html#improve_performance
* @author Scott Fredrickson [skotfred]
* @since 2015jan28
* @version 2015jan28
*/
public final class CachedUserAgentStringParser implements UserAgentStringParser {

private final UserAgentStringParser parser = UADetectorServiceFactory.getCachingAndUpdatingParser();
private static final int CACHE_MAX_SIZE = 100;
private static final int CACHE_MAX_HOURS = 2;
/**
* Limited to 100 elements for 2 hours!
*/
private final Cache<String , ReadableUserAgent> cache = CacheBuilder.newBuilder().maximumSize(CACHE_MAX_SIZE).expireAfterWrite(CACHE_MAX_HOURS, TimeUnit.HOURS).build();

/**
* @return {@code String}
*/
@Override
public String getDataVersion() {
return parser.getDataVersion();
}
/**
* @param userAgentString {@code String}
* @return {@link ReadableUserAgent}
*/
@Override
public ReadableUserAgent parse(final String userAgentString) {
ReadableUserAgent result = cache.getIfPresent(userAgentString);
if (result == null) {
result = parser.parse(userAgentString);
cache.put(userAgentString, result);
}
return result;
}
/**
*
*/
@Override
public void shutdown() {
parser.shutdown();
}
}

REFERENCES:

HTTP Strict Transport Security (HSTS)

The HTTP Strict Transport Security feature lets a web site inform the browser that it should never load the site using HTTP, and should automatically convert all attempts to access the site using HTTP to HTTPS requests instead.

Example Use case:
If a web site accepts a connection through HTTP and redirects to HTTPS, the user in this case may initially talk to the non-encrypted version of the site before being redirected, if, for example, the user types http://www.foo.com/ or even just foo.com.

Problem:
This opens up the potential for a man-in-the-middle attack, where the redirect could be exploited to direct a user to a malicious site instead of the secure version of the original page.

Risk:
For HTTP sites on the same domain it is not recommended to add a HSTS header but to do a permanent redirect (301 status code) to the HTTPS site.

Bonus:
Google is always “tweaking” their search algorithms, and, at least at present time, gives greater weight to secure websites.


# Optionally load the headers module:
LoadModule headers_module modules/mod_headers.so

<VirtualHost *:443>
# Guarantee HTTPS for 1 Year including Sub Domains
Header always set Strict-Transport-Security "max-age=31536000; includeSubdomains; preload"
</VirtualHost>

Then you might (optionally, but recommended) force ALL HTTP users to HTTPS:

# Redirect HTTP connections to HTTPS
<VirtualHost *:80>
ServerAlias *
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTPS} off
#RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [redirect=301]
</IfModule>
</VirtualHost>

That’s it…

REFERENCES: