Our First Installment of 'Biebs Rants on X': Biebs Rants on Phone Sounds

So earlier yesterday morning, in a haze after spending all night coding, I decided it would be a good idea to start a video series the AWSome Hackers are calling 'Biebs Rants on X.'

The backstory is that after living with Jesse Beyroutey (or JB, or Biebs) for only a few weeks, we've started to learn that there are several things about the world that just annoy him. Not that he's a bad guy; we love him and he's super nice. We just notice that there are certain things he wishes he could change. There are many more, but we decided to start with Phone Sounds because - appropriately - he started making choking gestures towards the phone this morning after it rang during a meeting of ours.

Hope you enjoy!

Another reason why Posterous and Hacker News are so awesome - great community

This is something I discovered today after trying to figure out how to post code snippets onto Posterous - check it out:

http://news.ycombinator.com/item?id=366718#

 

This is Posterous co-founder Garry Tan spending all night building a feature someone on Hacker News felt like they needed. How much more customer-centric can you get? How much scrappier can you get?? This willingness to listen, and the fast response time, are a testament to Garry Tan and the Posterous team.

Not only is this a great example of the benefits of caring about users, but it's also an example of how Hacker News has really developed a community of hacker-users who love to try products, give useful feedback, and promote change.

The general lesson, though, is just to pay attention to your users, care about them, and execute! I'm thoroughly impressed by the response time to this posting (as was 'sdurkin', it seems), and I suspect so was the rest of the hacker community. Major kudos to Posterous, or to any company able to pull off something like this. Little things like actually listening and responding fast make a great impression on users and the whole community. The goal is to get people to fall in love with the product and the visionaries behind it. I willingly admit that discovering this old post did both for me with regard to Posterous.

Now it's our turn to emulate this behavior.

Python Web-page Scraping - Installing lxml and Beautiful Soup

So I've always used RegEx to scrape all my data. Honestly, it can get pretty tough/tedious for a noob like me. I've been able to use it, but it's just a hassle. And until a few days ago, I thought this was my only route.

Fortunately for me, a few super-smart-engineer-entrepreneur friends (Noah Ready-Campbell and Calvin Young) told me about lxml and Beautiful Soup. They said it was a little tricky to install, but I didn't believe them... I tried it out for myself and actually had a lot of trouble getting it going. Eventually I stumbled upon something that made it pretty easy for me, but I'm hoping to turn that around and make it even easier for you to get.

So here it goes (disclosure: this worked on my Ubuntu EC2 AMI and Ubuntu home machine):

How To Install lxml:

UPDATE: 

Try:

sudo apt-get install python-lxml python-beautifulsoup

Thanks to lamby of HN for this! I just tried it and it worked on my new Ubuntu EC2 AMI... if anyone finds out this doesn't work please report it to me/someone-actually-important!

/UPDATE

The problem people usually have is that lxml has a lot of dependencies, and if any one of them is missing the install just never seems to work. So here is what we'll end up getting through this:

  • libxml2 - the XML C library that lxml is built on
  • libxslt1.1 - the XSLT C library, another dependency of lxml
  • libxml2-dev - the libxml dev header
  • libxslt1-dev - the libxslt dev header
  • python-libxml2 - python bindings for libxml2
  • python-libxslt1 - python bindings for libxslt1
  • python-dev - python dev headers
  • python-setuptools - the thing that lets you run easy_install

So here's how it should all look:

And if some of these exact commands don't work, try searching for the package or updating your package index:
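As a rough sketch only: the little Python script below just prints the commands I'd expect to run, pieced together from the package list above. It doesn't run anything itself. The final easy_install step and the apt-cache search are my assumptions about how the pieces fit together, and the package names may differ on other Ubuntu releases.

```python
# Packages from the list above. The names may differ on newer Ubuntu releases.
PACKAGES = [
    "libxml2",
    "libxslt1.1",
    "libxml2-dev",
    "libxslt1-dev",
    "python-libxml2",
    "python-libxslt1",
    "python-dev",
    "python-setuptools",
]


def install_commands(packages=PACKAGES):
    """Return the shell commands, in order, as plain strings."""
    return [
        "sudo apt-get update",  # refresh the package index first
        "sudo apt-get install " + " ".join(packages),
        "sudo easy_install lxml",  # assumed last step: install lxml itself
    ]


def search_command(name):
    """Command to look up a package whose exact name has changed."""
    return "apt-cache search " + name


if __name__ == "__main__":
    for command in install_commands():
        print(command)
    print(search_command("libxslt"))
```

Copy the printed lines into a terminal one at a time, so you can see which step fails if something goes wrong.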

 

Boom! Should be done. If you guys are running Ubuntu and have issues with this feel free to email me: wesley.zhao@gmail.com

 

How To Install BeautifulSoup:

This is much easier and hardly needs any instruction.

First go here to find the file you want: Beautiful Soup Downloads

Then, depending on your file path and the download you chose, this is how you do it:

 

Once you have these installed check this post out: lxml; an underappreciated web scraping library.

The post has some great examples of how easy it is to scrape with lxml and BeautifulSoup. It's practically like being able to grab elements with CSS selectors!

Again if anyone has any questions feel free to email me! I know the set-up process can be a huge pain so...yeah.

Posting Code Snippets on Blogs - avoid the hassle, use Github/Gist

I have to admit - I love Posterous, but for some reason I couldn't get code snippets to post using the "[code]" tag or their markdown syntax. I went through a little tutorial/page on how to do it the old way (where I actually discovered something interesting that I'll blog about later) and the new way they recently announced to mark code.

None of them worked! I will assume that it was my fault and it has something to do with the custom CSS I used to edit this theme.

Either way, the solution is simply to use Gist.

Just head over to Gist.

Name your Title

Name your File

Choose a Language

Start typing your code, and at the end just copy the url at the top of the address bar (will look something like )

And paste the link directly into your blog (if it's a Posterous blog; if not, copy the embed code instead).

Here's how I found out how to post Gist to Posterous.

How to automate slugs (and other fields) in Django Models

Going to post this really quick because I have to keep coding..

Here is something I learned really quick. I want to write it down for my own sake, and I thought it might be useful to y'all as well.

If you want to automate slug creation for your Model here is how you would do it:

That is how you do basic slugify-ing. Basically you have to make sure you define the slug field, but set a default so nothing is required of you when you first instantiate the Model. Then override the Model's default save method to slugify your title before saving. I'm still trying to figure out how to make a fail-safe so that all slugs are unique and nothing clashes...
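For the uniqueness fail-safe, here is one possible approach, sketched in plain Python so it runs without Django. Everything in it is an assumption on my part: simple_slugify is only a rough ASCII stand-in for Django's own slugify, and the existing set stands in for whatever slugs are already in the database.

```python
import re


def simple_slugify(text):
    """Rough ASCII-only stand-in for django.utils.text.slugify."""
    text = text.lower().strip()
    text = re.sub(r"[^a-z0-9\s-]", "", text)  # drop punctuation
    return re.sub(r"[\s-]+", "-", text).strip("-")


def unique_slug(title, existing):
    """Return a slug for `title` that is not already in `existing`."""
    base = simple_slugify(title) or "untitled"
    slug = base
    counter = 2
    # Keep appending -2, -3, ... until the slug is free.
    while slug in existing:
        slug = "%s-%d" % (base, counter)
        counter += 1
    return slug


print(unique_slug("Hello, World!", {"hello-world"}))
```

In a real model, the membership check would probably become a database lookup inside the overridden save method, and marking the slug field as unique would still be worthwhile, since two saves at the same moment could pick the same slug.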

Here is the official Django documentation on how to override saving.

Something to learn about Cron jobs

Other than their odd/awkward naming, cron jobs can be pretty useful.

From what I know, a cron job is just a job you can set in Linux to run in the background on a timer. You can set it to run at a specific month, day, or time, or have it repeat every day, hour, or minute (one minute is the finest granularity cron offers).

For me, I'm currently using a cron job to tell me (every hour) if there are any new listings on Craigslist sublets in Palo Alto.

Just a little background on the project, I'm running it on my Ubuntu AWS EC2 instance, using lxml/BeautifulSoup libraries for Python, and the Twilio API (and its Python wrapper) for this.

Setting up the Python file to scrape and send texts if it found anything new fortunately did not take too long - just one/two hours - but setting up the cron job was infuriating.

First I couldn't exactly get the timing syntax right, then I tried using a .sh file to make sure it would execute, then I had to make it executable, then I tried just calling a Python file, and it just kept going on and on and on.

But some of the things I learned along the way:

  • * */1 * * * is the same thing as * * * * * which tells the cron job to run every minute...
  • 0 * * * * and @hourly actually get it to run hourly
  • Make sure to make the file executable: chmod +x /path/to/file
  • Also make sure the first line of the script tells the system which interpreter to run: #!/path/to/interpreter
  • Make sure to use the full path to the file you are executing.
  • And ask smarter people for help more often
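To see why the first two schedules above behave so differently, here is a small sketch that expands the minute and hour fields of a cron line and counts the runs per day. It is a simplified model of my own, not cron's real parser: it only handles *, */n, single numbers, ranges, and comma lists, and it assumes the day, month, and weekday fields are all *.

```python
def expand_field(field, low, high):
    """Expand one cron field into the sorted list of values it matches."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = low, high
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return sorted(values)


def runs_per_day(schedule):
    """Count daily runs, assuming day/month/weekday fields are all '*'."""
    minute, hour = schedule.split()[:2]
    return len(expand_field(minute, 0, 59)) * len(expand_field(hour, 0, 23))


for schedule in ("* */1 * * *", "* * * * *", "0 * * * *"):
    print(schedule, "->", runs_per_day(schedule), "runs per day")
```

The */1 in the hour field matches every hour, and the * in the minute field still matches every minute, so that schedule fires 1440 times a day. Pinning the minute field to 0 is what brings it down to 24.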

There is probably more I learned - but that is all I remember at 4:30 in the morning. The biggest fix that I cannot believe I messed up was just making sure to use the full path to the file you are executing to make sure the cron job can find it.

So here was my final cron job code:

      0 * * * * /home/ubuntu/update_craigslist.py

And this was at the first line of my update_craigslist.py file:
#!/usr/bin/python
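The full-path lesson applies inside the script too, since cron does not start in the script's own directory. Below is a minimal sketch of how the bookkeeping part of such a script could look. It is not my actual update_craigslist.py: the function names and the idea of a text file of seen listing ids are hypothetical, and the scraping and texting parts are left out entirely.

```python
#!/usr/bin/python
import os


def data_path(script_file, name):
    """Absolute path to `name` next to the script, whatever the current directory is."""
    return os.path.join(os.path.dirname(os.path.abspath(script_file)), name)


def load_seen(path):
    """Read previously seen listing ids, one per line (empty set if no file yet)."""
    if not os.path.exists(path):
        return set()
    with open(path) as handle:
        return set(line.strip() for line in handle if line.strip())


def find_new(listing_ids, seen):
    """Return the ids not seen before, keeping their original order."""
    return [listing_id for listing_id in listing_ids if listing_id not in seen]


def save_seen(path, seen):
    """Write the seen ids back out, one per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(sorted(seen)) + "\n")
```

The script would call data_path(__file__, ...) once at startup and pass the result to load_seen and save_seen, so the hourly run finds the same file no matter where cron launches it from.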

Anyways - I plan on putting a lot more in depth (this was not in depth at all, I know) tutorials up as I continue to build and learn. In addition, there is a lot of stuff I've learned so far that I need to put up as well.

I've Switched!

Goodbye WordPress.

Hello Posterous.

Some of my old posts which have been imported over may have some formatting issues (e.g. it looks like a lot of new-lines have been removed or something). It looks really ugly I know! So don't look at my old posts!

Maybe I'll get around to reformatting them...maybe.

And the AWSome Hackers' Video Diaries Begin!

 

So the video diaries of the AWSome Hackers begin... If you are wondering about the odd spelling of 'AWSome' please feel free to ask :). (Hint: it has something to do with the correct pronunciation of Amazon Web Services).

So the plan is to keep anyone interested (so basically that just means our moms and dads) updated on our life here trying to make it big in Silicon Valley. We want to be as 'daily' as we can with our video updates and to really keep everyone in the loop. Of course we will be open about the goings-on of our startup and our ideas as we feel an open policy is how to best keep this vibrant startup community growing.

Here is our first video that I embedded and here is a link to our YouTube Channel: AWSome Hackers. Sorry if we seem a little tired - we had a long day and it's getting close to our bed time. We'll try to make sure we're a little more energetic and engaging in our future videos! Also we have some blasts from the past we want to upload too including our trip down for the Y Combinator Interview and our trek as TechStars New York City finalists. Enjoy!

Writing my first Python/Django app - already proved I'm fallible

Sorry guys (the one or two of you, or just me, that still reads my blog) for being incognito these past couple weeks. Finals hit, then I got home and enjoyed myself a bit with friends and my girl.

I've been working on getting up to speed on a Python web framework for a while, and I chose Django because I heard there was a lot of support out there and it seems to be taking the lead: it is high-level enough to do cool things easily and editable enough that it gives you low-level access (unlike Ruby?).

So I got stuck on a hiccup while setting up the Django environment, which (I later found out, thanks to my Dad) came from Cygwin using a Unix-style file system while my Python still only recognized Windows file system paths. I finally figured it out after slaving for hours and hours, scouring the web for answers and asking my dad for help (he actually ended up patching it up temporarily). I was so happy that I could move on to actually learning Django instead of setting it up.

Then I left it, got home after a bit, and started using my home desktop (I had set up Django on my laptop previously). I was about to write a tutorial for setting up Django on Windows when I got curious: could I have set it up super easily, and avoided that whole file system incompatibility nonsense, if I had just used the Windows command prompt? So I started using the Windows command prompt and it worked (so far)! WOW. I wasted all that time slaving over Cygwin on my laptop... lesson learned though. Avoid tunnel vision.

I will continue my Django tutorial on both my laptop and my desktop (I figure it will be helpful anyways to do it twice... and ingrain it into my mind). If both work out fine, I will simply write my tutorial on how to set up Django on Windows using the Windows command prompt, avoiding the mess I fought with!! Unless someone requests the Cygwin version as well... I would be more than happy to write it.

Anyways... good luck to me on getting this all working. And as soon as I figure out whether both work or not, I will write that tutorial!