All posts by Adam Goerdt

A day at saw mill ‘Het Jonge Schaap’

A series of active wind powered sawmills in the Netherlands
This particular mill is named ‘Het Jonge Schaap’ (The Young Sheep).
A look inside the mill. A log is waiting to be sawn into boards on the sled next to the support beam.
Inside the mill’s cap. While the rest of the structure is stationary, the cap can be turned into the wind when needed. The sails are affixed to the cast iron axle and drive the giant ‘brake wheel’. Its name is derived from the curved wooden pieces around the wheel that act as a brake.
The brake is operated by a giant lever whose weight presses the wooden blocks against the wheel. This lever is operated by the long stick you can see coming out of the back of the cap in the previous pictures.
At 2:1, the transmission ratio from the brake wheel to the main shaft is quite low for a windmill. This is needed because of the enormous mass of the machinery that has to be kept in motion.
A floor below lies the crankshaft. Here the rotating motion of the axle is transformed into a piston-like motion for the saws.
The saw pistons have levers attached to them to drive various other machinery below.
Sails partially unfurled waiting for the wind
The log to be sawn is affixed to a wooden sled. With each upward motion of the saw it is pushed 1 mm into the saw blades.
The adjustable saws work through the logs at about 2.5 meters per hour, a tremendous increase in productivity compared to manual labor.
The sled is also moved by wind power. With each upward movement of the saw an iron claw grabs the toothed wheel and turns it a bit.
Some of the boards produced by the mill

 

Found on Imgur

Advanced Load Testing with Gatling

Gatling is an incredibly useful tool for stress testing web applications based on real-world user interaction, and it automatically outputs beautiful dynamic HTML reports to allow you to analyze the results of the tests.  To start testing with Gatling download the latest version from the Gatling project page and unzip the contents.  In the ‘bin’ directory of the extracted folder are two sets of executables: gatling and recorder.  The Recorder application stands up a proxy server which records all user interaction with the target site, and the Gatling application is used to replay the recorded session.  If you’re not already familiar with recording and replaying scenarios the Gatling Quickstart guide is an excellent resource to get you up to speed.

For most websites the basic ‘Record’ and ‘Replay’ functionality works well, but if you need to POST data to a site with CSRF protection you’ll need to update the auto-generated code to account for it.  I was recently tasked with load testing a customer application, prior to its public release, that has just this type of protection.  The site was built using JavaServer Faces, and each POST request is protected by a javax.faces.ViewState parameter which acts as a CSRF token and is updated on every page load.  If the correct ViewState is not submitted, the POST will fail.

Handling the ViewState

To get around this issue three new functions are required to capture the current viewstate from the previous page request and submit it on subsequent requests:

val jsfViewStateCheck = regex("""id="j_id1:javax.faces.ViewState:1" value="([^"]*)"""")
    .saveAs("viewState")
def jsfGet(name: String, url: String) = http(name).get(url)
    .check(jsfViewStateCheck)
def jsfPost(name: String, url: String) = http(name).post(url)
    .formParam("javax.faces.ViewState", "${viewState}")
    .check(jsfViewStateCheck)

The first item is a check which does a regex match on the response body looking for the ViewState (Note: if there is no ViewState present the check will report an error and the request will show up as a KO in the report).
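For reference, the hidden field that this regex matches typically looks something like the following in the rendered page (the value shown here is just a placeholder, and the exact id can vary between JSF implementations):

<input type="hidden" name="javax.faces.ViewState"
       id="j_id1:javax.faces.ViewState:1"
       value="...opaque state token..." />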

The second function is a wrapper around the Gatling GET method which runs jsfViewStateCheck against the response and saves the captured ViewState into the “${viewState}” variable.

The third function is a wrapper around the Gatling POST method.  Like the GET wrapper it saves the new “${viewState}” value, but it also submits the previously captured “${viewState}” as a form parameter.

To implement these functions in your load test you will need to alter all of the recorded GET and POST requests to use the new functions as shown below:

GET

The GET request code from the recorded session

exec(http("request_identifier")
    .get("http://jsfsite.com/page"))

would be updated to

exec(jsfGet("request_identifier","http://jsfsite.com/page"))

POST

The POST request code from the recorded session

exec(http("request_identifier")
    .post("http://jsfsite.com/page"))

would be updated to

exec(jsfPost("request_identifier","http://jsfsite.com/page")
    .headers(headers_0)
    .formParam("form_field", "payload"))

The jsfPost method will automatically append the ViewState form parameter to the request, and the POST will succeed.

Handling Uploads

File upload requests need to be handled a little bit differently.  HTML file upload forms set the enctype="multipart/form-data" attribute, which changes how browsers submit the data to the web server.  Instead of sending the data as the POST payload, the file contents and form field data are sent delimited by a random string called a boundary (Here’s the RFC for those curious).  In order to send the file to the web server we need to construct this type of request, which requires different methods than the formParam method we used previously, as shown below.

exec(http("upload_request")
    .post("http://jsfsite.com/upload")
    .headers(headers_0)
    .bodyPart(StringBodyPart("form_field_1","payload"))
    .bodyPart(StringBodyPart("form_field_2","payload"))
    .bodyPart(StringBodyPart("javax.faces.ViewState","${viewState}"))
    .bodyPart(RawFileBodyPart("file_upload_file","some_file.txt"))
 )
.check(jsfViewStateCheck)

Notice that instead of using the formParam method to set the fields we use the bodyPart method.  This tells Gatling to submit the payload as a multipart form request instead of a standard POST.  To set the form fields, new body parts must be created depending on the type of data: string fields use StringBodyPart and files use RawFileBodyPart.  The file that you wish to upload should be located in the $GATLING_HOME\user-files\bodies\ directory.

A final .check(jsfViewStateCheck) on the upload request updates the “${viewState}” variable once the request completes.

Processing 10 Million SQL Rows in a Reasonable Amount of Time

Overview

A while ago I was talking with someone about high-performance MySQL, trying to figure out how fast we could make updates to the database.  This conversation later led me to my lab, where I devised a simple scenario to see just how fast we could push the database:

  • Create a simple 2-column table consisting of a primary key and a 45-character string of text
  • Populate the table with 10 million rows of random data
  • Alter the table to add a varchar(255) ‘hashval’ column
  • Update every row in the table and set the new column to the SHA-1 hash of the text

The end result was surprising: I could perform SQL updates on all 10 million rows in just over 2 minutes.  If I could lock the tables so that no other processes could write to them during script execution I could perform the update in 30 seconds.

Operating Environment

Below is the hardware and software setup that I used to run the test.

Operating System: Fedora 21 (4.1.6-100.fc21.x86_64)
MySQL Version: 5.6.27
CPU: Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz
RAM: 8GB
Storage: 2x OCZ-SOLID3 (RAID 0), ~640 MB/s reads, ~775 MB/s writes
Java Version: openjdk version "1.8.0_65"

Let’s Get Started

The first thing we need to do is to create a table to store the records:

CREATE TABLE `bulktest` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `text` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

Now that the table is created we need some code to populate it.

I started writing code to insert the data in a fairly standard manner: create a prepared statement, set the parameters, execute the statement.  Each thread was programmed to open a database connection, execute 1000 inserts, then close the connection.  This approach suffered from severe performance issues, however: I was only able to insert about 1000 rows/second per thread, and even with multiple threads the maximum throughput was still only around 9000 rows/second.  I also tried changing the number of rows inserted per thread (ranging from 500 to 10000), but it didn’t appreciably alter the result.  Using this method it would take about 19 minutes to populate the table, so I went about designing a better method.
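For reference, the per-row version of the insert loop looked roughly like the sketch below.  It reuses the BulkProcessor.getConnection() and getRandomText() helpers that appear in the bulk inserter later in this post; treat it as an illustration of the approach rather than the exact code I ran.

public void run() {
    try {
        // One connection per thread, 1000 single-row inserts per connection.
        Connection conn = BulkProcessor.getConnection();
        PreparedStatement preparedStmt =
            conn.prepareStatement("insert into bulktest (text) values (?)");

        for (int i = 0; i < 1000; i++) {
            preparedStmt.setString(1, BulkProcessor.getRandomText(45));
            preparedStmt.execute();   // one round trip per row limits throughput
        }

        preparedStmt.close();
        conn.close();
    }
    catch (Exception e) {
        e.printStackTrace();
    }
}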

Enter Bulk Inserts

If you’ve ever looked inside the output of a mysqldump file you’ll have seen that the insert statements are written in such a way that multiple rows are inserted with each statement, as shown below:

INSERT INTO table VALUES (1,23),(2,34),(3,33),(4,39),(5,43),(6,58);

The MySQL website has an article which addresses this concept.  I decided to attempt to incorporate this feature into my bulk inserter code.  Here’s the result:

Inserted 10000000 rows in 0 minutes and 29 seconds (344827.6 rows/second)

Wow! That’s roughly a 38x increase over the ~9000 rows/second I was getting before! Let’s take a look at the code and see how it’s written.  The following is the run() method in the BulkInserter thread.  The main function instantiated 15 threads and executed the following code 1000 times with a batch size of 10,000.

public void run() {
    StringBuilder sb = new StringBuilder();
    sb.append("insert into bulktest (text) values ");

    try {
        Connection conn = BulkProcessor.getConnection();

        // Build the multi-row VALUES clause: (?), (?), (?), ...
        for (int i = 0; i < batchsize; i++) {
            if (i == 0) {
                sb.append("(?)");
            }
            else {
                sb.append(", (?) ");
            }
        }

        // The statement text must be complete before any values can be bound.
        PreparedStatement preparedStmt = conn.prepareStatement(sb.toString());
        for (int i = 0; i < batchsize; i++) {
            preparedStmt.setString(i + 1, BulkProcessor.getRandomText(45));
        }

        preparedStmt.execute();
        preparedStmt.close();
        conn.close();
    }
    catch (Exception e) {
        e.printStackTrace();
    }
}

This is a simple method which builds a bulk insert prepared statement and sets each `text` parameter to a random string value. Notice that there are two for loops in the code: one which appends the value placeholders to the SQL statement and a second which binds the parameters.  This is because the text of the prepared statement must be complete prior to setting the values.

Retrieving Data

To test read throughput on the MySQL instance I built a simple select() function to retrieve all of the rows that were inserted.  I ran out of memory when executing the function against the entire table of 10 million rows, so I had to limit this test to 8 million rows:

Selected 8000000 rows in 0 minutes and 14 seconds
Avg speed: 571428 rows/sec

I ran a second test where the rows were loaded into an Object to see what type of impact object instantiation would have on performance:

Selected 8000000 rows in 0 minutes and 16 seconds
Avg speed: 500000 rows/sec

A third test appended each object to an ArrayList then removed the item from the list (again because I kept running out of memory):

Selected 8000000 rows in 0 minutes and 16 seconds
Avg speed: 500000 rows/sec

I was surprised to see that the overhead introduced by instantiating the objects was minimal.
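The retrieval code was along these lines (a minimal sketch: the Row class is a stand-in for whatever object you load the columns into, and the LIMIT reflects the 8 million row cap mentioned above):

// Stream through the result set and instantiate an object per row.
Connection conn = BulkProcessor.getConnection();
PreparedStatement preparedStmt =
    conn.prepareStatement("select id, text from bulktest limit 8000000");
ResultSet rs = preparedStmt.executeQuery();

int count = 0;
while (rs.next()) {
    Row row = new Row(rs.getInt("id"), rs.getString("text"));
    count++;
}

rs.close();
preparedStmt.close();
conn.close();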

Manipulating Data

At this point I’m ready to update the table with the SHA-1 hash, but first I need to alter the table to accept the hash value.  I timed an alter statement to set a baseline:

$ time mysql -u root -e "ALTER TABLE bulktest.bulktest ADD COLUMN hashval VARCHAR(255) NULL AFTER text;"

real 1m13.954s
user 0m0.003s
sys 0m0.003s

This ALTER statement, which took about 1 minute and 14 seconds, is my theoretical best case: it is the amount of time it takes the database engine to find all 10 million rows and append a NULL value to each one.  You should not be able to beat this time.

Now it’s time to update the data.  I started with a simple method which downloaded the data from the server, hashed the value, and executed an UPDATE prepared statement to re-submit the data to the server.  Performance was, as expected, pretty poor:

Updated 10000000 rows in 30 minutes and 34 seconds
Avg speed: 5452 rows/sec

That is far too long.  I’ve already discovered that I can INSERT and SELECT a large amount of data very quickly, so why should UPDATE be any different?
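For reference, the per-row approach looked roughly like this (a sketch; I’m using Apache Commons Codec’s DigestUtils.sha1Hex() here for the hashing, which may not be exactly what my script did):

// Read each row, hash it, and write it back with its own UPDATE statement.
Connection conn = BulkProcessor.getConnection();
ResultSet rs = conn.createStatement()
        .executeQuery("select id, text from bulktest");
PreparedStatement update =
        conn.prepareStatement("update bulktest set hashval = ? where id = ?");

while (rs.next()) {
    update.setString(1, DigestUtils.sha1Hex(rs.getString("text")));
    update.setInt(2, rs.getInt("id"));
    update.execute();   // one round trip per row is the bottleneck
}

conn.close();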

In order to speed up the updates I took advantage of the fact that MySQL can perform a join and an update in the same statement. I created a new table, bulktest_temp, which contains the same columns as bulktest.  I kept the SELECT function the same and altered the INSERT function to point to bulktest_temp.  Finally I added another statement to the end of the script:

UPDATE bulktest,bulktest_temp SET bulktest.hashval = bulktest_temp.hashval WHERE bulktest.id = bulktest_temp.id

Executing the script yielded:

Updated 10000000 in 1 minutes and 46 seconds
Avg speed: 94339 rows/second

That’s much better.  It’s also more in line with what I would consider reasonable, and it comes in right around the sum of the time it took to alter the table and select all of the rows.  However, during the course of developing these scripts I thought of a way to process all the data even faster.

Let’s Go Faster

Even though the script is able to process all of the rows much more quickly than I would have thought, I figured we could speed up execution a bit more if I added the constraint that the full table be locked during script execution.

To do so the code was altered to do the following (a rough sketch of the surrounding SQL follows the list):

  • LOCK the bulktest table
  • Execute the SELECT and INSERT portions of the code, but insert into a new persistent table instead of the temporary table
  • TRUNCATE the original `bulktest` table, then DROP it
  • ALTER the newly created table to rename it to `bulktest`
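Here is a rough sketch of the SQL around the data-processing phase, issued over JDBC.  The new table name (bulktest_new) and the choice of a READ lock are my own fill-ins, and I’m assuming the worker threads read bulktest and bulk-insert into bulktest_new over their own connections in between.

Connection conn = BulkProcessor.getConnection();
Statement stmt = conn.createStatement();

// Block other sessions from writing to bulktest for the duration of the run.
stmt.execute("LOCK TABLES bulktest READ");

// ... SELECT from bulktest, hash each row, and bulk-INSERT into bulktest_new here ...

stmt.execute("UNLOCK TABLES");

// Swap the tables: empty and drop the original, then rename the new table into place.
stmt.execute("TRUNCATE TABLE bulktest");
stmt.execute("DROP TABLE bulktest");
stmt.execute("ALTER TABLE bulktest_new RENAME TO bulktest");

stmt.close();
conn.close();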

The final results were amazing:

Updated 10000000 rows in 0 minutes and 30 seconds
Avg speed: 333333.3 rows/second

Garden Update

I think I planted too many beans. This is one week's worth, not counting what we've eaten/given away
Baby Cucumbers
Purple leaves and purple flowers will soon lead to purple peppers

Ash Frame

A coworker wanted to buy one of my wife’s paintings and knew that I did a little bit of woodwork, so he asked me to build a frame for it as well.  Below are pictures from the build.

This is the board I started with. It’s ash wood about 6 ft in length

 

Here are the boards cut to size. The painting is 24×36. The boards were cut about 2″ wide, and 3.5″ longer than the painting

 

Some of the boards got a bit of blade burn when running them through the table saw.  My blade needs cleaning

 

A little bit of time with a sander and all better

 

The short boards received mitered edges

 

I didn’t get a photo showing how the pieces were attached.  I used mitered half laps cut with a dado blade to join the sections together.  Here’s a picture from a previous frame showing how it looked

 

Here’s a test fit of the half-laps prior to gluing. I also went through and cut a 1/2″ deep, 1/4″ wide rabbet on the back of the frame where the painting will be seated

 

Gluing everything together. The mitered half-laps did a good job of keeping everything square

 

Test fit with the frame. Everything fit perfectly

 

My coworker wanted a white frame to match the decor in his house, but he wanted to be able to see the grain as well. I ended up using 2 coats of a white stain dye base to get the color but still show the grain. Finished it with 3 coats of polyurethane

 

Telephone Switching

I’m fascinated by old technology and the solutions that people came up with to solve their problems without computers. Here we can take a behind-the-scenes look at how telephone switching worked using relays and pulse counters.

Planer Stand

I recently picked up a new-to-me planer from Craigslist and, after lugging it home, came to the conclusion that I needed a small and mobile planer stand. After a bit of Google image searching for 'planer stand' I stumbled across this
Continuing on to the website I saw that the person who designed it also posted plans. Looks like I need to get to work.
The stand and tables can be cut from a single 4x8 sheet of plywood. I had some MDF left over from the workbench build, so I used half a sheet of plywood and some MDF. If I had to do it again I would have left the MDF in the scrap pile. Here are the four pieces of the frame cut to size
I used a circular saw to cut the angles for the sides. I thought the pieces that I cut off were waste but they proved to be quite useful
Squaring the sides.
Here is the finished frame. The support pieces in the corner were cut from the leftover material from the side panels.
Materials for the infeed and outfeed tables cut to size. The support rails were cut so that the tables will be aligned with the planer. It was a trial-and-error process to get the height correct
Assembled infeed and outfeed tables
Gluing some rails to support the legs
Adding a 90 degree brace for the legs
Hinges were also cut from the leftover material from the side panels
Gluing the hinge to the top of the stand. After the glue dries a single screw is inserted from the bottom for extra support. The plans I found used casters but I went with the HTC2000 base.
After the hinges were affixed to both sides I added the tables. The outfeed table was cut so that it rests about 1/64" below the planer table. I used some shims to raise the infeed table so it rests about 1/64" above the planer table.
Stretching her legs
Folded up for storage

Credit for the design and plans goes to Jeff Makiel from the Sawmill Creek forums