Connection aborted error using python elasticsearch with large files on AWS ES

AWS ES has an upload limit of 10MB per request. If you are using the bulk helpers or reindex and some documents push a request over this limit, you will get an error: `ConnectionError: ('Connection aborted.', error(32, 'Broken pipe'))`.

To solve it, use the `max_chunk_bytes` argument, which can be passed to reindex like so:

from elasticsearch.helpers import reindex

reindex(es, source_index, target_index, chunk_size=100,
        bulk_kwargs={'max_chunk_bytes': 10485760})  # 10MB AWS ES upload limit

Ideally, set chunk_size to the average number of documents that fit under 10MB; then, if some larger documents push a chunk over 10MB, the elasticsearch library will handle splitting it.
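One way to pick that chunk size is to estimate the average serialized document size from a sample. A minimal sketch (the helper and its names are mine, not part of the elasticsearch library; only the 10MB limit comes from above):

```python
import json

MAX_CHUNK_BYTES = 10 * 1024 * 1024  # AWS ES upload limit, as above

def estimate_chunk_size(sample_docs, limit=MAX_CHUNK_BYTES):
    """Return a chunk_size whose average chunk stays under `limit` bytes."""
    avg_doc_bytes = sum(len(json.dumps(d)) for d in sample_docs) / len(sample_docs)
    return max(1, int(limit // avg_doc_bytes))

# With ~1KB documents, roughly ten thousand fit in a 10MB chunk
print(estimate_chunk_size([{"body": "x" * 1000}]))
```

Outliers larger than the average are then caught by max_chunk_bytes rather than aborting the connection.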

Error running vagrant provisioner when default vagrant share is disabled

If you have disabled the default vagrant shared directory:

[sourcecode language='ruby']
config.vm.synced_folder '.', '/vagrant', disabled: true
[/sourcecode]

You will get the following error using a provisioner:

==> default: Running provisioner: ansible_local...
default: Installing Ansible...
default: Running ansible-playbook...
cd /vagrant && PYTHONUNBUFFERED=1 ANSIBLE_FORCE_COLOR=true ansible-playbook --limit="default" --inventory-file=/tmp/vagrant-ansible/inventory -v /ansible/init.yml
bash: line 3: cd: /vagrant: No such file or directory
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

To fix, set the provisioning_path configuration:
[sourcecode language='ruby']
config.vm.provision "ansible_local" do |ansible|
  ansible.provisioning_path = "/ansible"
end
[/sourcecode]

Importing/restoring elastic search snapshot to AWS Elastic Search Service

Took me a long time to find out how to do this.

A few people have re-posted a lot of this AWS article but missed out some crucial details:

The general idea is:

  1. Create an AWS bucket and put the snapshot files into it (don’t use a subdirectory, the .dat files should be in the bucket root).  No need to change permissions on the bucket or anything.
  2. Create an IAM role and policy as per the documentation in the AWS docs link above.  When creating the role using the web management console you need to choose the EC2 role type and manually modify the trust relationship after creating it.
  3. Run a python script (can find this in the docs link above) using the boto library to register the bucket as a snapshot repository in ES.  You need to sign the request regardless of the ES access policy you are using.  HOWEVER set `is_secure` to `True`.  Without this I was getting `<html></html>` returned instead of any error messages.
  4. Use curl to do the restore (no need to sign restore/backup requests if your access policy is open / IP-based).  Again check the doc for the exact curl command, but as above use https instead of http to get real error messages.
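For reference, the repository registration body and restore call from steps 3 and 4 look roughly like this (every name below — bucket, role ARN, domain, repository, snapshot — is a placeholder, not a value from this article):

```python
import json

# Placeholders throughout
bucket = "my-snapshot-bucket"
role_arn = "arn:aws:iam::123456789012:role/my-es-snapshot-role"
endpoint = "https://search-mydomain.eu-west-1.es.amazonaws.com"  # https, not http

# Body for registering the bucket as a snapshot repository (signed request, step 3)
register_body = json.dumps({
    "type": "s3",
    "settings": {"bucket": bucket, "role_arn": role_arn},
})

# URL for the restore request (sent with curl -XPOST in step 4)
restore_url = "{0}/_snapshot/{1}/{2}/_restore".format(endpoint, "my-repo", "my-snapshot")
print(restore_url)
```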

Mount docker socket inside an ECS container

  1. To your container, add a new volume
  2. Name: 'docker_sock', source path: '/var/run/docker.sock'
  3. In the Storage and Logging section, add a new mount point
  4. Select 'docker_sock', container path: '/var/run/docker.sock'

And that’s it. No need to give privileged access, and if you run docker commands directly from inside the container there’s no need to change IAM policy.
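The console steps above correspond to a task-definition fragment like this (a sketch; the container name and image are placeholders):

```python
import json

# Sketch of the task-definition JSON the console steps above produce.
task_definition = {
    "volumes": [
        {"name": "docker_sock", "host": {"sourcePath": "/var/run/docker.sock"}},
    ],
    "containerDefinitions": [
        {
            "name": "my-container",      # placeholder
            "image": "my-image:latest",  # placeholder
            "mountPoints": [
                {"sourceVolume": "docker_sock",
                 "containerPath": "/var/run/docker.sock"},
            ],
        },
    ],
}
print(json.dumps(task_definition, indent=2))
```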

Associating EC2 instances with an ECS cluster

The EC2 instance is associated with a Container Service cluster via the /etc/ecs/ecs.config file on the instance, in the format ECS_CLUSTER=your_cluster_name.

The EC2 instance must also have the ECS agent installed. If you create the instance using the ECS AMI this will be pre-installed (search for AMI called amazon-ecs-optimized).

This configuration can be put in the User Data field:

echo ECS_CLUSTER=your_cluster_name >>/etc/ecs/ecs.config

To find the setting on an instance that already exists: Actions -> Instance Settings -> View/Change User Data

Exact instructions for setting up the EC2 instance properly can be found here:

Running high trust apps / plugins with Sharepoint Foundation 2013

The User Profile service is not available on Sharepoint Foundation and therefore high trust apps are not supported.  However, I found that after installing a high trust app it worked until IIS was restarted, after which it would generate authentication errors.  Re-uploading the app package to the app site fixed the authentication errors and allowed the app to run properly again.

I could not find an API to automate this re-upload process, so instead it can be done using a headless browser.

Here’s an example in casperjs (note that on Windows casperjs must be installed in a path without spaces):

[sourcecode language='js']

var url = '';   // site URL (left blank in the original)
var file = '';  // path to the app package (left blank in the original)
var user = 'username';
var pass = 'password';

var casper = require('casper').create();

casper.setHttpAuth(user, pass);

casper.thenOpen(url, function() {
    this.waitForSelector('#idHomePageNewDocument-WPQ2', function() {
        this.echo("Found selector");

        casper.thenClick('#idHomePageNewDocument-WPQ2', function() {
            this.echo("Clicked button");
            this.waitForSelector('.ms-dlgFrameContainer > iframe', function() {
                this.echo("Got the iframe");

                casper.withFrame(1, function() {
                    this.waitForSelector('#aspnetForm', function() {
                        this.echo("Found form");
                        this.fill('#aspnetForm', {
                            'ctl00$PlaceHolderMain$ctl01$ctl04$InputFile': file,
                        }, false);
                        this.wait(3000, function() {
                  ('#ctl00PlaceHolderMainctl00RptControlsbtnOK');
                            this.echo("Clicked button");
                        });
                    });
                });
            });
        });
    });
});

casper.run();
[/sourcecode]

The term ‘Get-SPApplicationPrincipal’ is not recognized as the name of a cmdlet, function, script file or operable program.

Ignore what it says on this technet page:

The command is actually `Get-SPAppPrincipal`.

You can see the correct command here:



Handling HTTP status code 100 in Scrapy

You might have some problems handling the 100 response code in Scrapy.  Scrapy uses Twisted on the backend, which itself does not handle status code 100 properly yet:

The remote server will first send a response with the 100 status code, then a response with the 200 status code.  To get the 200 response, send the following header in your spider:

'Connection': 'close'

If your 200 response is also gzipped, Scrapy might not gunzip it, in which case you need to set the following header as well (an empty Accept-Encoding):

'Accept-Encoding': ''

And if Scrapy will not process the responses at all, you might need to set the following spider attribute:

handle_httpstatus_list = [100]
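Putting the three settings together, a spider sketch (the class here is illustrative; in a real project it would subclass scrapy.Spider and pass the headers on each Request):

```python
# Headers to send with each request, per the notes above
DEFAULT_REQUEST_HEADERS = {
    'Connection': 'close',   # skip the 100 Continue exchange
    'Accept-Encoding': '',   # avoid gzip that Scrapy may fail to decompress here
}

class MySpider:  # illustrative; really: class MySpider(scrapy.Spider)
    name = 'my_spider'
    handle_httpstatus_list = [100]  # let the spider see 100 responses
```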

Oh noes! Kivy dependency installation has killed my graphics

If you are in the habit of blindly installing packages onto your personal machine while following installation instructions for some shiny new program, and that shiny program is Kivy (1.9) and your version of xorg is a bit out of date, you may find a black screen the next time you boot.

This is due to the following kivy dependencies, which upgrade your xorg packages and break xorg:


When you try to start xorg, you will get an error similar to:

No such file or directory: /usr/bin/x


To fix, upgrade xorg to the quantal version:

apt-get install xserver-xorg-lts-quantal


I rebooted into failsafe graphics mode, then normal mode (both failed), then normal mode again and everything was working.