Belgian Perl Workshop 2011, review

We (Jean Benoit and I, from Strasbourg.pm) were on the highway from Luxembourg to Brussels; there was a light fog and we weren’t completely awake when we saw a beautiful white horse galloping towards us, going the wrong way. Yeah, welcome to Belgium :). This edition of the Belgian Perl Workshop (thanks to the Belgian mongers for the organization and to Dirk De Nijs and Module Builder for sponsoring it) was the smallest Perl event I have ever attended (1 day, 1 track, something like 20 attendees) and it really pleased me: we were few (mostly Dutch and French people) but all of us were experienced mongers and all the talks (including mine, I hope) were instructive (the topics, the speakers, and also the interactions between them and the audience).

Note to self: tux told us about the surprising performance of SQLite and CDB during his talk on tying a key/value store to a hash. I hope he’ll give us some feedback about it.
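To make the idea concrete for readers who don't know Perl's tie: the point is to put an ordinary hash interface in front of an on-disk store. Here is a minimal sketch of the same pattern in Python (not the code from the talk). The class name `SQLiteDict` and the table `kv` are my own, hypothetical choices.

```python
import sqlite3
from collections.abc import MutableMapping


class SQLiteDict(MutableMapping):
    """Dict-like view over a two-column SQLite table (hypothetical helper)."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def __getitem__(self, key):
        row = self._db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __setitem__(self, key, value):
        # Overwrite any existing value for this key, like a plain dict would.
        self._db.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )
        self._db.commit()

    def __delitem__(self, key):
        cur = self._db.execute("DELETE FROM kv WHERE k = ?", (key,))
        if cur.rowcount == 0:
            raise KeyError(key)
        self._db.commit()

    def __iter__(self):
        # Materialize the keys first so callers may modify the store while looping.
        rows = self._db.execute("SELECT k FROM kv ORDER BY k").fetchall()
        return iter(key for (key,) in rows)

    def __len__(self):
        return self._db.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
```

With this, `store = SQLiteDict("cache.db"); store["perl"] = "5.14"` reads and writes like a normal dict while the data lives in SQLite. Committing on every write keeps the sketch simple but is slow; a real benchmark would batch writes in a transaction.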

The BPW site doesn’t mention the lightning talks, which were interesting too:

  • Liz about Perl one-liners and strictness
  • Laurent about the French Perl events: there will be plenty of them in 2012, including the QA Hackathon. If you’re interested in sponsoring, please let me know.
  • Claudio about you! If you can help the Perl community at FOSDEM, please contact him or Gabor
  • Closing talk by Ecocode

The end of the event was a bit rushed: everyone had to go back home and we didn’t even have time for a beer event. That’s sad, but as FOSDEM will take place in Brussels too, I guess I’ll see you in February.


Before YAPC::Eu (ye11 report part 1)

I have attended all the Perl/OSDC events since 2009. I fell in love with the Perl community from the start. They are smart, open-minded, helpful and fun. Why had I waited so long to meet them IRL?! When Linkfluence sponsored me to attend my first YAPC::EU and give a talk on Perlude, my CTO warned me: it’s even better than FPW because you realize how big and committed this community is. I was curious …

Actually, it’s awesome! Talking with people who had inspired me for so many years (Larry Wall, Damian Conway, ...) and meeting all those clever, anonymous people coming from all over Europe to share ideas and give feedback was an eye-opening experience. Thanks, Linkfluence, thanks! I really mean it!

I have read a lot of reviews since, and everything is true :) Riga is a very beautiful place. The weather was wonderful and Latvians are cheerful and outgoing. Plus, the YAPC team did some very good work. As a result, the attendees and speakers had nothing to worry about: everything was simple and pleasant. Thanks, guys!

My first “class” was the Speaker training. Before the conference, I was skeptical about its value, but the presentation blew me away and I now think that every speaker should attend this course at any OSS conference. I heard a lot of good advice about preparing and giving a workshop, and Damian led us to more meta questions, such as “why am I really giving this talk?”, “who is my audience?” and others I had never asked myself despite the enormous value conferred by answering them. So yes, Damian, I try to “be a Lion” :)

Then came ./yapc --start and we learned that Frankfurt will host YAPC::EU 2012 (this homepage wouldn’t be so cute if Jean wasn’t there). Congrats, you earned it. (How about YAPC::EU at Strasbourg ’20?)

Afterwards, we heard Larry Wall’s keynote about post neo mutable aristotic modernism or something. I have to admit I had trouble making sense of this cerebral journey from Riga to the Bremen town musicians, to Stalin’s conception of modernism, to the risk of duck attacks, and finally to the future of Perl. It seems that Larry talked about images and ideas, their evolution and relationship with their origins, and the way they evoke the past.

While Perl has kept a strong spirit and community, it has changed a lot since 5.10. We have to spread these changes as well as Perl’s old strengths.

PS: I’m about to leave Linkfluence HQ to join the Belgian Perl Workshop 2011. See you tomorrow.


YAPC::Eu survey and next events

The results of the YAPC::EU 2011 (Riga) survey are online. This document gives some good indications about the shape of the Perl community, how active we are in attending Perl events and how we are all getting old ;). In France we have a vibrant, though still somewhat obscure, community and it’s all our fault: we haven’t posted anything in French about YAPC::Eu, the new Perl versions, MetaCPAN or even the upcoming events.

However, you will see French people at every one of those events, and the French mongueurs are heavily involved in the organisation and sponsoring of the QA Hackathons.

Thanks to Linkfluence and Les Mongueurs de Perl, I’ll attend all those events. So guys, if you’re one of those who missed the Perlude talk at YAPC, see you soon! I’ll also write a linux french page.


Do you know Josette?

Do you know Josette? You have already seen her if you attended an OSS conference somewhere in Europe in the last few years. You were even really close to her for a few minutes. But you don’t remember, do you? You stared with envy at the treasures she brought to you at all those events with an incredible, unbreakable cheerfulness: the stacks of books at the O’Reilly UK stand.

This cheerfulness is one of the things I really want to find again when I attend a new event, not only from Josette but from all of you. Josette shares not only smiles and time but also our convictions and energy. We often say that Perl is our community, and definitely, Josette is one of us. I would like to thank her for that.

Talking about community, we chatted about it in Riga: our strengths and weaknesses, and what we can do to improve it. I told her I’m really proud to have joined a company that is involved in the community not only because it understands this is a long-term investment but because the board shares our convictions about what the computer world must be. So she connected me to Mark Keating, who has made big efforts to market Perl in the UK, and she promised me a printed copy of The Art of Community. She gave me the copy at OSDC.fr this weekend. On behalf of everyone at Linkfluence, I would like to thank her and O’Reilly UK for this present.

This reminds me that I never finished my YAPC review. I will soon!


Our summer schedule (mainly perl events)

June: for the first time in French Perl Workshop’10 history (afaik), we ran out of slots! Not only was this edition bigger, but the technical level was good and our old fellowship saw a lot of new, very interesting people. As platinum sponsor, members of the organization team and speakers (we gave 6 talks in all), Linkfluence is very happy about this success. We really hope it will encourage more companies to help the community. As our Parisian spot has definitely become too small, the next edition will take place in Strasbourg (and I heard about Marseille’13, which could be nice).

July: same happiness at RMLL, run by mongers from Linkfluence, Biblibre, a staff member and a student from Université de Strasbourg. There was a long thread on the mongers mailing list about this unexpected Perl popularity. One of my personal conclusions is that we must ease access to the community (let’s start some new .pm groups!).

The summer isn’t over: Nils, Stéphane and Guilhem just came back from the 5th International AAAI Conference on Weblogs and Social Media (ICWSM) in Barcelona.

August: Marc will attend YAPC::Eu to give one (perhaps 2) talks.


Goodbye, Franck (and good luck!)

“The next thing I’ll migrate is my ass to the USA”, Franck Cuny said while monitoring his last migration of a huge amount of data for us. Linkfluence is getting bigger and bigger but sometimes, someone leaves. Hired by SAY:, Franckie goes to Hollywood (or something), leaving us today.

We’ll miss him! He held a very important place at Linkfluence (as a friend, a software architect, a developer and community manager) as well as in the French community as a whole (prolific programmer, very active member of the French mongers board and brilliant, accurate speaker). We can clearly say that we are sending our best spy to SF :)

cya at YAPC, lumberjaph.


GitHub Poster

Here we are again, with a new poster of the various GitHub communities!

A few weeks ago Franck started to work on a new exploration of the GitHub communities, and he published the results today.

You can use StarGit to browse your network, and read Alexis’ article.

Unlike last year, we won’t try to sell printed posters, as it’s too much work. However, feel free to download and print it yourself. The poster is A1 size, so you should have no difficulty finding a shop that will print it for you.

Last year, while creating the first poster, most of the work was done manually by Antonin, our graphic designer. When the graph was ready in Gephi, we exported it as SVG, imported it into Illustrator, and then Antonin did all the work to create the awesome poster.

This was the first time we did this kind of work, and he spent a huge amount of time on this one. As we wanted to produce more posters of this kind for our clients, Antonin looked at some solutions to automate the process. He found a nice solution with Scriptographer.

This lets you write scripts in JavaScript to manipulate your illustration.

Now we use the following workflow:

  1. export your graph from Gephi as SVG
  2. import the graph into Illustrator
  3. copy the SVG into a new document with the size you want
  4. execute all the scripts that will:
    1. extract information about communities
    2. add title / logos
    3. add the thumbnails for each community
    4. let you put a label on each community and change the colors if needed

This way we’re able to produce a poster within a few minutes.


French Perl Workshop 2011 conference coming up soon!

The French Perl Workshop 2011 Conference will take place in Paris, France on June 24 and 25. It is aimed at professionals and enthusiasts alike, and as every year it will host plenty of interesting talks about new Perl features and libraries, best practices, and also more general programmer-oriented topics. As always, the whole event will be free.


Linkfluence has been a sponsor of FPW for three years now, and most of the dev team will attend the conference, as speakers or in the audience. It is a great place to meet and talk, and this year there will be a conference dinner each evening, so there will be plenty of time!

You can register for the conference, check out what talks we have planned and submit your own talk proposals; there is still time left. It makes a great sounding board for your talks if you are interested in speaking at YAPC::Europe this year in Riga from 15 to 17 August 2011.


Linkfluence big picture

Since we explained in our first post why we switched from CouchDB to Riak, several readers have asked us what we are doing at Linkfluence, what kind of tools we use and what our global architecture is.

We harvest social data on the web

As explained on our main website, Linkfluence is a research company, specialized in the social web, which sells marketing and opinion research studies to advertisers and communication agencies.

Unlike traditional research institutes, which work with polls and classical surveys, at Linkfluence we focus on spontaneous speech on the social web (i.e. blogs, forums, social networks and so on).

Our main goal at Linkfluence labs is to harvest data from the social web. To make this possible, we have developed several systems over the last four years to retrieve data from the web and extract intelligence from it.

Main Architecture

To achieve this we have designed a modular system composed of workers communicating with each other through a centralized message queue system. We have workers which:

  • fetch syndication feeds and store the extracted metadata to Riak;
  • resolve and canonicalize URLs;
  • fetch and store the raw HTML pages inside Riak;
  • extract qualified data and links from the raw page harvested and store them inside Riak (with the original document) and inside MongoDB (for graph manipulation);
  • create a mask in order to extract qualified data from the HTML;
  • etc.

PostgreSQL is our database of choice. Each PostgreSQL database is wrapped inside a Catalyst instance which provides RESTful APIs. Workers are written in Perl too, and they use SPORE to connect to the various APIs. All the data fetched and extracted is stored in Riak.
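To illustrate the worker idea, here is a minimal sketch in Python rather than our Perl, and it is not our production code. It makes several assumptions: in-process `queue.Queue` objects stand in for the centralized message queue, a plain dict stands in for Riak, and every name (`canonicalize`, `url_worker`, `store_worker`, `run_pipeline`) is hypothetical. The canonicalization rule is deliberately simplistic.

```python
import queue
import threading
from urllib.parse import urlsplit, urlunsplit

STOP = object()  # sentinel telling a worker to shut down


def canonicalize(url):
    """Lowercase scheme and host, drop the fragment, default the path to '/'."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def url_worker(inbox, outbox):
    """Read raw URLs and emit canonical ones for the next worker."""
    while (url := inbox.get()) is not STOP:
        outbox.put(canonicalize(url))
    outbox.put(STOP)  # propagate shutdown down the pipeline


def store_worker(inbox, store):
    """Stand-in for fetch-and-store: count how often each canonical URL is seen."""
    while (url := inbox.get()) is not STOP:
        store[url] = store.get(url, 0) + 1


def run_pipeline(urls):
    raw, canonical, store = queue.Queue(), queue.Queue(), {}
    workers = [
        threading.Thread(target=url_worker, args=(raw, canonical)),
        threading.Thread(target=store_worker, args=(canonical, store)),
    ]
    for worker in workers:
        worker.start()
    for url in urls:
        raw.put(url)
    raw.put(STOP)
    for worker in workers:
        worker.join()
    return store
```

Each worker only knows its inbox and its outbox, so adding a new processing step means adding a new worker and a new queue without touching the existing ones.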

We make platforms and visualizations

Ultimately, we make country and/or language centric engines.

We have a specialized worker which pushes specific data into a dedicated business unit:

  • a Solr index which stores textual posts and external links;
  • a MongoDB which stores the graph of internal links;
  • a Redis instance which lets us compute dynamic indicators for ranking.

All these tools are driven by Perl code which handles all data and provides front APIs.

These APIs are consumed by our frontend CMS and Flash expert visualizations, which are then used by researchers to write their studies.

Today, we cover 6 countries (France, Germany, USA, UK, Italy and the Netherlands) which represent over 60k sources (websites), and we have more than a year and a half of archives available. We limit ourselves to 60k sources for some specific reasons (they are not technical).

Evolution

I have described our current system architecture. Today, when we want to add data from new source types, we just have to write new workers to retrieve their content and insert it into the system. In the same way, we have built other business modules to manage Twitter data using MongoDB and ElasticSearch. Why we are considering switching from Solr to ElasticSearch for this purpose will be the topic of another post. But I can say, among other problems, that Twitter can’t be qualified in a country-centric way, which makes it harder to design a good way to shard Solr instances. For now, the way ElasticSearch scales is a better match for the way we plug Twitter into our system.


Links review

10 find command in unix: I don’t know about you, but I always forget how find works. Each time I need to do something, I have to dig into the man page and read the various examples. This article shows some basic tips for find. Beware: the minus characters in the post are not true ASCII characters, so type your own. In addition, you can efficiently replace -print | xargs rm -f with a simple -delete option.

The Google Vortex: The last testimony of a jedi going to join the dark side of the force ;)

How Grooveshark Uses Gearman: Gearman is a job queue system written by Brad Fitzpatrick during his time at LiveJournal. This is an awesome piece of software that lets you write synchronous and asynchronous workers. In this article, Jay Paroline shows how Grooveshark uses Gearman to distribute jobs.

parisdevop: This is the site for the devops group based in Paris. You should find reports from the various meetings and some articles. Each Wednesday, a links review will be published. Warning! This is in French.

cpan syntax highlighting: search.cpan.org now does optional syntax highlighting! And color is the way to go nowadays, right?
