I’m nearing the end of a two-week work trip to Japan and wondering why I left here eight years ago, especially after nearly breaking down after my last few hours at work this week. Everyone’s been so nice and friendly, and in the evenings, I’ve had some great nights out with old friends. Sustainably, too, if I weren’t in a hotel.

I wrote a pretty cranky post about it then. Of the 24-ish points I made, I think 5 are still valid, I was wrong about 5, and the remaining 14 no longer apply to me, Japan, or the world at large. (For example, you can’t get away from cell phones anywhere now, and smoking is way down.)

In part, I know it was an emotional decision, running away from a disintegrating relationship – something I didn’t write about then because it was too close to home. I also had a lot of other problems that I’ve since moved past, or that the world has left behind: people being fake, too little motivation to learn Japanese, etc. Basically, I was overwhelmed by life and my emotions. I couldn’t see my way clear.

More interestingly, my Japanese seems to have gotten better with disuse. I’m not fluent by any means but it seems I’m better at grammar, making myself understood in personal and business situations, and suddenly kanji is clicking. (Thanks plane! _@_y)

Would I move back? Yes, for the right opportunity. And I know that, this time, I would do it without writing a tirade about what I don’t like about Canada, the US, or anywhere else I work (Argentina, Brazil, etc.). I wouldn’t be running from, I’d be jogging to.

stunned at ms-plurk-ripoff

it’s no secret that i’m a big fan of plurk, despite my recent absence (social media exhaustion set in). i am especially happy because my plurkbuddy alvin woon moved back east to help promote the service, where it became the #1 microblogging service in China (prior to being firewalled).

not just because i want to see the little guy win, but also because it is just appalling that this occurred:

Microsoft China stole Plurk’s UI and code and is pretending it’s their own service.

i could see a two-bit startup doing this, or some non-multinational heavyweight that figured they could get away with it because they pay their lawyers more than the little guy. but this is just bald-faced theft.

i worked at a company many years ago who had their code stolen, and spent many years in the courts shutting down the competitor started by ex-employees who stole the code. from looking at the code involved, it was obvious it was a copy; in many places, error messages contained the same misspellings!

at the time, the ceo swore that he wouldn’t stop until he won back all of the business he lost to “the thieves,” and sued for damages for every cent lost. realizing they had a losing battle, the founders pled no contest, then the purchasing company settled out of court for about us$285mil all told. sadly, many of the customers they lost probably still use the purchasing company’s software instead; i think that company came out on top in the marketplace (for various other reasons). so my employer was vindicated, but didn’t manage to win back all of the business lost.

Plurk doesn’t have the resources to complete the lawsuit, but i hope that they find some other way to shut this down. it’s a different world now, 10-15 years later; maybe social media itself can stop this assault on the innovator. hopefully it will be before they, too, lose their loyal and active client base to a competitor.

okonomiyaki recipes

Good buddy neillathotep has been bugging me to post about my okonomiyaki escapades – so here you go.

The recipe is really simple. Shred cabbage, green onion, garlic scapes (if you have them!), red ginger (you can buy this pre-shredded), shredded nori. Cut up a bunch of other foods you’d like in your okonomiyaki, such as bacon, mochi, pork, squid, etc. For today’s recipes I made one with bacon, and another with brie and asparagus. Mix up 2 parts flour to 1 part dashi — you can use buckwheat flour if you like, or a mix of white, whole wheat, sweet potato/potato, etc. You can use salted water if you don’t have dashi. Also get one egg per pancake.

Ingredients for okonomiyaki

In a bowl mix up the egg with the cabbage and a cup of the flour-water mixture. Heat up a griddle to medium hot. Oil with sesame or sunflower oil. Dump out the egg-cabbage-batter and spread out to a pancake. Top with your special toppings. Use a spatula to press down on the pancake until the bottom is well cooked. Flip the pancake over and repeat the pressing routine until the pancake looks dry in the center when viewed edge-on.

Okonomiyaki viewed edge on. Just about ready!

Remove from the grill. Top with shredded nori, red ginger, kewpie mayo (if you can find it!), katsuobushi (dried bonito flakes – ditto) and okonomiyaki sauce (decent collection of recipes here). Cut into 4 pieces and serve.

Ready to eat!

CouchDB 0.9.0 bulk document post performance

Based on a tip from my university colleague Chris Teplovs, I started looking at CouchDB for some analytics code I’ve been working on for my graduate studies. My experimental data set is approximately 1.9 million documents, with an average document size of 256 bytes. Documents range in size from approximately 100 to 512 bytes. (FYI, this represents about a 2x increase in size from the raw data’s original form, prior to the extraction of desired metadata.)

I struggled for a while with performance problems in initial data load, feeling unenlightened by other posts, until I cornered a few of the developers and asked them for advice. Here’s what they suggested:

  1. Use bulk insert. This is the single most important thing you can do. This reduced the initial load time from ~8 hours to under an hour, and prevents the need to compact the database.
     Baseline time: 42 minutes, using 1,000 documents per batch.

  2. Don’t use the default _id assigned by CouchDB. It’s just a random ID and apparently really slows down the insert operation. Instead, create your own sequence; a 10-digit sequential number was recommended. This bought me a 3x speedup and a 6x reduction in database size.
     Baseline time: 12 minutes, again using 1,000 documents per batch.
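For anyone who wants to try both tips together, here is a minimal sketch in Python (not the code I actually used). It assumes the documents are plain dicts and that the database is reachable over HTTP; the helper names (sequential_id, batches, bulk_payloads, post_bulk) and the zero-padding of the 10-digit ID are my own choices for illustration. The only CouchDB-specific piece is the _bulk_docs endpoint, which takes a JSON body of the form {"docs": [...]}.

```python
import json
from itertools import islice
from urllib import request


def sequential_id(n, width=10):
    """Return n as a zero-padded decimal string (assumed ID format)."""
    return str(n).zfill(width)


def batches(docs, size):
    """Yield lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_payloads(docs, batch_size=1000, start=0):
    """Yield one JSON request body per batch, assigning sequential _id values."""
    counter = start
    for chunk in batches(docs, batch_size):
        out = []
        for doc in chunk:
            # Copy the document and set our own _id instead of CouchDB's random one.
            out.append({**doc, "_id": sequential_id(counter)})
            counter += 1
        yield json.dumps({"docs": out})


def post_bulk(db_url, payload):
    """POST one batch to <db_url>/_bulk_docs and return the parsed response."""
    req = request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```

To load a data set you would loop over bulk_payloads(my_docs, batch_size=1000) and pass each payload to post_bulk with your own database URL. Building the payloads lazily keeps memory use flat even with a couple of million documents.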

Using 1,000 documents per batch was a wild guess, so I decided it was time to run some tests. Using a simple shell script and GNU time, I generated the following plot of batch size vs. elapsed time:

Strange bulk insert performance under CouchDB 0.9.0

The more-than-exponential growth at the right of the graph is expected; however, the peak around 3,000 documents per batch is not. I was so surprised by the results that I ran the test 3 times – and got consistent data. I’m currently running a denser set of tests between 1,000 and 6,000 documents per batch to characterize the peak a bit better.
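If you want to reproduce this kind of sweep without a shell script and GNU time, a rough Python equivalent might look like the sketch below. It is an illustration rather than what I ran: time_batch_sizes is a hypothetical helper, and load stands in for whatever function performs a full data load at a given batch size (it should also reset the database between runs, which is not shown here).

```python
import time


def time_batch_sizes(load, batch_sizes, repeats=3):
    """Time load(batch_size) `repeats` times per size; return {size: [seconds, ...]}."""
    results = {}
    for size in batch_sizes:
        runs = []
        for _ in range(repeats):
            t0 = time.perf_counter()
            load(size)  # hypothetical: performs one complete load at this batch size
            runs.append(time.perf_counter() - t0)
        results[size] = runs
    return results
```

Keeping every run, rather than only the mean, makes it easier to tell whether an odd peak is consistent across repeats or just noise.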

Are there any CouchDB developers out there who can comment? You can find me on the #couchdb freenode channel as well.

deleting users

If you’ve had trouble posting on my blog since I opened up comments, you should be able to do so now. I’ve deleted all registered users – so, if you were registered before, now you shouldn’t be forced to log in just to comment.

thing-a-day #10: choc chip cookies

Still burned out on music. So I made chocolate chip cookies, from this NY Times recipe. I had to make a few substitutions because of what I had on hand:

  • Whole wheat flour instead of regular flour (still used the cake flour)
  • Dark brown sugar instead of light brown sugar
  • President’s Choice chocolate chips instead of the gourmet ones suggested