PayPal X Platform:
- Now that @PayPalX Digital Goods is out of beta http://bit.ly/gd5JZm here's a refresher on how to use it: http://bit.ly/hqPf1E #
- New @PayPalX DevTalk newsletter http://bit.ly/i2x3z0 includes free @OReillyMedia epayments report http://bit.ly/fh6ysK (login required) #
- Fav line from this new @PayPalX ecommerce article http://bit.ly/hRBoRg "If you're paying for things upfront nowadays, you're doing it wrong" #
Big data:
- Looking for a succinct description of the Big Data harvest and analysis process? Here you go: http://bit.ly/esUeQJ by @hmason on dataists #
- Nice report on the final day of #StrataConf http://bit.ly/dSSmim again from @JanWillemTulp #BigData #
- Working through the #StrataConf Data Bootcamp http://bit.ly/fceOVi asap (ZIP of materials is at http://bit.ly/campyslides) #
- More #StrataConf recap http://bit.ly/dHaAah including the Data Bootcamp #
- An oldie but a goodie: Sensors smash social networks in creating Big Data http://bit.ly/ebunOV (the flight data numbers are incredible) #
- Congratulations to Digital Reasoning @dreasoning on their public beta of the Synthesys Platform http://bit.ly/hlxfed at http://bit.ly/gu5hdj #
- NPR discussion with Bitly dataist @hmason http://n.pr/hFd06e delves into divining emergent props from data, plus what happens if Bitly dies #

Wireless and mobility (never would have thought Nokia would pick Microsoft):
- Please let Nokia's announcement tomorrow *not* be that they're going with Windoze Phone http://tcrn.ch/fVPbxS #
- PC mag got the key points of Nokia's Microsoft deal spot-on. Big winner: MS. One of the big losers: Nokia http://bit.ly/fA4ZeK #
- More on the Nokia-Microsoft two turkey marriage http://bit.ly/g2hPbd & NOK heads are already being replaced by Softies http://tcrn.ch/ftjyEm #
- The End of the Nokia Raj http://bit.ly/gIF2TS by @om gets the key details, including the importance of Indian and Chinese hardware, spot-on #
- Mobile World Congress trends to watch http://bit.ly/hMLa1p via @gigaom (mobile payments is a big one, @PayPalX fans) #
APIs and web development:
- I'm giving ActivePython community edition http://bit.ly/ezmHTv a try after "fun" with Cygwin Python #
- Investigating BeautifulSoup for HTML scraping in Python http://bit.ly/hctlSS on the recommendation of @ptwobrussell #
- "Paul's Python Pearls" @OReillyMedia webcast recording is now available at http://bit.ly/epX0c4 (12min preview at http://bit.ly/fIWeeG ) #
- Git in 5 minutes http://bit.ly/fPrYIp has been useful in getting my first project up on github http://bit.ly/hMYMRS #
- My github feed: http://bit.ly/hclz8Y (includes commits, new follows, etc.) #
- Using PHP, YQL, and RSS to add living data to your site http://bit.ly/e9AOG1 by my friend Peter Mancini @nectarineimp writing for @wpmods #
- I am *loving* the elegance of @CoffeeScript code vs JavaScript http://bit.ly/i1wvmL (worth trying Node just to fire up 'coffee') #
- Understanding prototypal inheritance in JavaScript http://bit.ly/hNE3ht #
- Google's new 2-step sign-in security http://bit.ly/gxvOlP; I wonder if they've managed to keep it simple or not #
- I'm looking forward to seeing more details on @RedDirtRubyConf posted Monday http://bit.ly/hS82ev #
- Research on Scala http://bit.ly/gkPUMT and Lift http://bit.ly/e4czbK turned up this Manning sample chapter http://bit.ly/fR12uj #
- I'm also looking forward to this SproutCore webcast http://bit.ly/fLkjEW to be hosted by @OReillyMedia on February 22nd #
Personal things:
- Trying out the LinkedIn Labs "Resume Builder". What do you think? http://bit.ly/g3VGGe #
- Contact information from the top of my resume, available in HTML and PDF formats http://bit.ly/bundles/billday/1j http://twitpic.com/3ydcjb #
- The people of Egypt have their liberty. I pray they see this through to protecting it with a just democratic government. #Jan25 #
- Great @FortuneMagazine piece on how @ConanOBrien resurrected himself via @TeamCoco and other #SocialMedia http://bit.ly/ibGjyN #
Running:
- Ultramarathon training for beginners http://bit.ly/eP1YhV via @dailymile #
- Ran 5.01 miles in 1 hour and 2 mins and 25 secs and felt great. Slow, slippery run with family as the snow starts to… http://bit.ly/gCw8Td #
I’ve been using a lot of RESTful web APIs of late.
I’m writing about them for the PayPal X DevZone blog (see my recent MoSoLo posts) and testing and documenting them in other work. I’ve also been doing some RSS feed data harvesting and filtering which, although it’s not technically RESTful work, uses a lot of the same skills and technologies to get things done.
If you’re doing any web development these days, I suspect you’re elbow deep in RESTlike and RESTful stuff, too.
Which leads me to this question: What tools make your RESTful development more fun and efficient? I have a few favorites I’d like to share.
My general approach when encountering a new API is:
- Learn about the resources available through the API (time to read the documentation, source, and example code)
- Prototype API calls, using an API console where possible; this can often be done in parallel with the first step to speed things up
- Once I have reasonably sorted out calls, I copy them into my scripts and programs and test to make sure everything’s good-to-go
Example: I recently wrote about the Facebook Graph API.
Read the complete post on the PayPal X Developer Network to learn how I got up to speed using Facebook resources, an Apigee console, and good old command line cURL.
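The last step of that approach, moving a prototyped call into a script, might look something like the following in Python. This is only a minimal sketch using the standard library: `build_url` and `fetch_json` are hypothetical helper names, and the object ID and field names in the usage section are placeholders rather than values from the full post.

```python
import json
import urllib.parse
import urllib.request


def build_url(base, path, params=None):
    """Join a base URL and a resource path, then append a sorted query string."""
    url = base.rstrip("/") + "/" + path.lstrip("/")
    if params:
        url += "?" + urllib.parse.urlencode(sorted(params.items()))
    return url


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON response body (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder resource and fields; substitute the call you prototyped
    # in the console, then pass the URL to fetch_json() to run it.
    url = build_url("https://graph.facebook.com", "/example-object-id",
                    {"fields": "id,name"})
    print(url)
```

Keeping URL construction separate from the network call makes it easy to compare the script's request against the one that worked in the console or in cURL before sending anything.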
Strata conference keynotes are being livecast. Watch the latest below. [UPDATE: Now that the conference has ended, O’Reilly is providing recordings of the keynotes via the stream embedded below.]
You can also follow along via Twitter with O’Reilly Media’s @strataconf and hashtag #StrataConf.
I’ve been contributing to the PayPal X Developer Network for a little over six months now, and it’s been a lot of fun. I love exploring the PayPal X Platform with you, and in particular focusing on the parts of the platform you find intriguing and most useful.
Near the end of last year, I started wondering which specific topics you were most interested in. How was our DevZone content doing with Developer Network community members? What were the topics you as a group were saying you most wanted to read about? And of those topics, which ones were members actually reading, i.e. what was getting the hits?
I decided to do some analysis of my own blog posts and articles. I collected hit statistics for my 2010 content and sorted it by date, hits, and hits broken out by type (article or blog post). I used that data to write up a summary of my findings. I also surveyed readers on their preferred application development language(s) (click here for the results) and which third party APIs matter the most to them. I am using all of this data to work out what to write for the DevZone in the coming months. One major area of focus that’s clearly emerged: Mobile + social + local APIs, technology, and application development, which I’m now collectively referring to and tagging as “MoSoLo”.
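For readers who would rather script this kind of sorting than do it in a spreadsheet, here is a rough Python sketch of the three views described above (by date, by hits, and hits broken out by type). The records are placeholders made up for illustration, not real DevZone statistics, and the function names are my own assumptions.

```python
from collections import defaultdict
from datetime import date

# Placeholder records for illustration only; not real hit statistics.
ITEMS = [
    {"title": "Example article", "type": "article",
     "published": date(2010, 9, 14), "hits": 900},
    {"title": "Example post A", "type": "blog",
     "published": date(2010, 7, 2), "hits": 350},
    {"title": "Example post B", "type": "blog",
     "published": date(2010, 11, 30), "hits": 1200},
]


def sort_by_date(items):
    """Return items ordered oldest first."""
    return sorted(items, key=lambda item: item["published"])


def sort_by_hits(items):
    """Return items ordered most-read first."""
    return sorted(items, key=lambda item: item["hits"], reverse=True)


def hits_by_type(items):
    """Return total hits for each content type (article or blog post)."""
    totals = defaultdict(int)
    for item in items:
        totals[item["type"]] += item["hits"]
    return dict(totals)
```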
As I worked through my own content from 2010, I also started thinking about how I might generalize my analysis to include all of the DevZone content from every contributor. How would I collect the data? What could be automated? And most importantly, what insight could be gained from the exercise?
This article is the result of my investigations into gathering up the pertinent DevZone content. A follow-on article will explore the data to summarize each logical topic and highlight what was learned from the analysis.
What shall we harvest?
I wanted to analyze all of the DevZone blog posts, articles, and book excerpts from the “Blog” and “Documents” RSS feeds. If for some reason those feeds wouldn’t provide me with everything I needed, I decided I would then mine the HTML page versions, in effect the “deep” pages, linked to from the DevZone Blog and Documents pages.
I gathered together a list of all of the content locations. In addition to the DevZone links, I also included blog links for each of the four regular DevZone contributors (Matthew Russell, Travis Robertson, Ethan Winograd, and myself). These individual blog links contain posts made by each of us before the new, all-in-one DevZone feeds were created.
Here then is a table of the HTML pages for each of the six areas to be harvested as well as their RSS feed links.
| Source content homepage | Feed location | Number of items as of 26 January 2010 |
|---|---|---|
| DevZone blog posts | RSS | 171 posts |
| DevZone articles and book excerpts | RSS | 63 articles and excerpts |
| Matthew Russell’s blog posts | RSS | 13 posts from before the cutover to the common DevZone blog feed |
| Travis Robertson’s posts | RSS | 11 posts pre-cutover |
| Ethan Winograd’s posts | RSS | 3 posts pre-cutover |
| My posts | RSS | 14 posts pre-cutover |
| TOTAL: | | 275 items |
Note that connections to the Developer Network server, and thus to both the pages and the feeds, are secured via HTTPS. This will be important later.
The plan
After gathering links to the content, the next thing I needed to do was figure out how to collect the data together into an analyzable form.
My first inclination was to sort out a mechanism that would allow me to plug in the RSS feeds for all six content sources, pull down all the items from each, sort them by their publication date, and then access and manipulate the resulting stream for analysis. I would pick the most promising tool I could find and then see if it could be used to successfully access and analyze the data. I had very limited time to sort out an automated mechanism, so if my top choice or two didn’t work out straight away, I would fall back to a spreadsheet as my backup plan. Falling back was not desirable from an efficiency and automation standpoint, but I knew it would work if everything else came up short.
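To make the idea concrete, here is a minimal Python sketch of combining several RSS feeds into a single stream sorted by publication date. It is a plain standard-library stand-in for illustration, not the tool I ended up using. It assumes each feed is RSS 2.0 text that has already been downloaded (fetching over HTTPS is left out) and that every item carries a `pubDate`; `parse_items` and `merge_feeds` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_items(rss_text):
    """Return (published, title, link) tuples for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        # RSS 2.0 dates use the RFC 822 style, e.g. "Mon, 03 Jan 2011 10:00:00 +0000".
        published = parsedate_to_datetime(item.findtext("pubDate"))
        items.append((published, item.findtext("title"), item.findtext("link")))
    return items


def merge_feeds(rss_texts):
    """Combine the items from several feeds into one list, newest first."""
    merged = []
    for text in rss_texts:
        merged.extend(parse_items(text))
    return sorted(merged, key=lambda entry: entry[0], reverse=True)
```

Once the items sit in one date-ordered list, the analysis becomes ordinary list processing, for example counting posts per month or grouping by source.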
My search turned up several dead-ends and then a couple of promising automation possibilities. My plan of attack became:
- Use Yahoo Pipes to combine the six RSS feeds into one and then operate on their contents as needed.
- If Pipes wasn’t able to handle the task by itself, try using Yahoo! Query Language (YQL).
- If Pipes and YQL failed, manually enter data from the article, book excerpt, and blog post pages for each item into a Google Docs spreadsheet for sorting and analysis (this is the same procedure I’d used previously for my own content).
Want to learn how to implement this? Read the full article on the PayPal X Developer Network (click here).
My PayPal X DevZone writing:
- My @PayPalX tech article "Developing and Deploying PayPal Apps" http://bit.ly/hBxzVz provides code level details to start #MoSoLo #
- New @PayPalX DevZone post: The Facebook APIs, part 2: Social plugins and the power of the Like button http://bit.ly/fJhOfM #MoSoLo #
Wireless and mobility:
- Canadian perspective on the mobile wallet http://bit.ly/glPrf6 which we've written so much about on @PayPalX DevZone http://bit.ly/feU6DJ #
- More on the mobile wallet: Is the end of credit cards near? http://bit.ly/gbZiix from @CNNMoney (answer: http://bit.ly/hNv74S) #
- The @TechCrunch take http://tcrn.ch/ePStz2 on Bloomberg's Apple NFC piece http://bloom.bg/fPKYY1 is TC usual: Cunning but snippy @PayPalX #
- Developers confirm it's the year of the tablet http://bit.ly/dQpd66 via @gigaom (source Appcelerator IDC survey http://bit.ly/h0xbTi) #
- Appcelerator also reports "a dramatic increase in the integration of geo-location, social, and cloud-connectivity services" #MoSoLo #
- Appcelerator survey also found mobile developers prefer @PayPalX over Apple and Google payment options http://bit.ly/evl2wg #MoSoLo #
- Facebook has officially drunk the mobile+social+local Koolaid http://tcrn.ch/dXNFq9 (they're even using @PayPal's words) #MoSoLo #
- "Taylor emphasized that the convergence of mobile, local and social is the most interesting locus of growth" http://tcrn.ch/dXNFq9 #MoSoLo #
- Will mobile undo Facebook's dominance as OS & the Net undid Microsoft and social is undoing Google? http://read.bi/gmZXTO #MoSoLo #
- Facebook's latest acquisition (Rel8tion) has location-based mobile ads written all over it http://rww.to/ecQE6h #MoSoLo #
Big data:
- "Finding trending topics using Google Books n-grams data and Apache Hive on Elastic MapReduce" http://bit.ly/fgqcD3 from @awscloud #
- Amazon has further ratcheted up the value of their @AWSCloud services with EC2 Spot Instances http://bit.ly/gBYfwT (very nice) #
- Twitter lists to follow about Cloud Computing http://rww.to/gg7SGk #
Web development and site stuff:
- It is refreshing to see such a straightforward intro as "The Art Of Scripting HTTP Requests Using Curl" http://bit.ly/g6bM4h #
- The HREF replacement using %c and %| is worth the price of Texter by itself (free!) http://lifehac.kr/f1cPx3 #
- Handy tip: How to generate a PDF with PHP http://oreil.ly/hBRIdM from @OReillyMedia #
- LinkedIn Companies Strategic Toolkit http://bit.ly/eWBaJp from Traction http://bit.ly/g2fZk0 via CEO @adamkleinberg #
- Is innovative Java dead? http://rww.to/e6YtRg (Oracle worries me) #
Personal things:
- Recent @GeekMomBlog posts from my wife: Alfred the Hedgehog http://bit.ly/fSwbj8 and a call for bakers http://bit.ly/ieUyqa #
- More @GeekMomBlog Star Wars projects: Modeling chocolate Chewie cake http://bit.ly/hBFBwL and build your own R2-D2 http://bit.ly/fmJM4p #
- Interesting way to parse the State of the Union address http://bit.ly/hHaUkj (charts linked to the pertinent parts of the speech video) #
- I don't understand how the Egyptian government thought shutting off the pipe would work http://bit.ly/gcRf1g (all it did was expose reality) #
Running:
- Ran 11.01 miles in 1 hour and 58 mins and felt good. Windchill in the teens. Slowed down in mile 3 to watch five whi… http://bit.ly/fKXUoF #
- Two of my passions, running and #MoSoLo meet in this @radar interview http://oreil.ly/fU55yo with the CEO of RunKeeper @jjacobs22 #
- Great news, runners: @RunKeeper will remain free http://bit.ly/gcjagI #








