I have already introduced you to this application, and I already told you about porting it to Zend Framework 2, but that's not all. Ever since I finished the first version, which only worked for Greek documents, I have wanted to add support for the English language and extend it further.

So after a few weeks of work and research, here it is with a brand new name: Sum+my, which stands for Summarization Methodology Yardstick.

summy_home.png summy_admin_documents.png summy_admin_terms.png  

 

Changelog v2.0

  • Brand New Appearance based on Twitter Bootstrap
  • Ported Application to Zend Framework 2.0
  • Added Support for English Language
  • Added Stemmer Test Page
  • Fixed typos in the Greek Stemmer, improved accuracy a lot!
  • Summaries are now cached to avoid double posting; the links are by no means permanent!

 summy_admin_home.png summy_stemmer_en.png summy_changelog.png

It's probably the work I am most proud of, because it's so far out of my element. So give it a try, you might actually find it a great tool for everyday tasks, and let me know if you have any ideas to improve it further. Or why not start adding support for more languages? For that you might also want to check the docs of Sum+my.

 

fmsdb.png

FM Social Scouting, codename FMSDB, is a project I did almost 18 months ago for fmscout.com that never went live for legal reasons. All the difficulties and delays are finally over and it will hopefully go public this week. FMSDB is essentially a database of football (soccer) players and clubs based on the video game Football Manager. Besides the personal information and game statistics, every profile page supports comments and ratings (hence Social Scouting) as well as photo galleries. Until the project officially goes live that's all I can say, but here are a couple of screenshots. Update: the project is live, you can check it online at www.fmscout.com/players.html

 fmsdb_player_search.png fmsdb_player_profile.png

The application is built as a module for Cotonti cms. It involved mostly PHP/MySQL code and data mining, which is something I always like. Originally the image galleries were hosted by ImageShack, but last year they changed their policy and the galleries had to be rewritten from scratch. Talk about a snafu, right? That's one piece of code I won't be reusing any time soon. ;p Over the past few weeks we also did the beta testing and some small fine-tuning, a year overdue.

Day number two of porting this application to Zend Framework 2. I only had a couple of hours today, but I still have a couple of things I want to talk about. I keep encountering things that are not that much different from ZF1, like Zend\Navigation or Zend\Paginator. I was quite surprised that I got them working in seconds.

In the app i have a paginated list of documents, and below is how the code changed:

Zend Framework 1

$db = Zend_Db_Table_Abstract::getDefaultAdapter();
$select = $db->select()->from('document', $orders)->order(array("$order $way"));
$paginator = Zend_Paginator::factory($select);
$paginator->setDefaultItemCountPerPage($perpage);
$paginator->setCurrentPageNumber($this->_getParam('page'));

Zend Framework 2

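// use statements needed at the top of the controller file:
use Zend\Db\Sql\Select;
use Zend\Paginator\Paginator;
use Zend\Paginator\Adapter\DbSelect;

$db = $this->getServiceLocator()->get('db');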
$select = new Select('document');
$select->columns($orders)->order("$order $way");
$paginator = new Paginator(new DbSelect($select, $db));
$paginator->setDefaultItemCountPerPage($perpage);
$paginator->setCurrentPageNumber($page);

 

Which, if you exclude all the use statements you have to add on top, is not bad; I actually think it's pretty much the same. It would be nice to be able to get the Select object from the adapter, but it's not a big deal. The second thing of the day was how actively ZF2 is developed. I found a couple of bugs today working with v2.0.3. The most important one was in Zend\Mvc\Router\Http\Query, which was adding the controller, action and namespace parameters to the query string instead of only adding the extra parameters it received. But guess what: download the latest version from GitHub and voila, the bugs are gone. (Lithium, are you there... ;p). That really made my day. I make a habit of not following popular trends and of rooting for the underdogs, see my history: I started learning PHP with LDU cms, probably the most unknown cms of all, and now I am choosing Lithium for Komposta Core instead of one of the most popular frameworks like Zend or Symfony or CakePHP, for Pete's sake. So it felt great to go from the disappointing moment of finding a bug to seeing that it was fixed in a minute. People are whining about how much ZF2 changed and that it got a bit Java-ish, but today at least, in my mind, ZF2 will succeed because of its active developers and contributors.

zf2.png

I've been following ZF2 from the very first beta versions, mainly reading presentations and tutorials, and occasionally browsing the code, especially the skeleton application, to see how the project progressed. So far I wasn't very impressed with it. The common view, as I understand it, is that it's not quite there yet and that it needs to be refined more before people really start to jump in and port their old applications to the new framework.

Today though I started porting my BA thesis application from ZF1 to ZF2, as I plan to extend it and add new features. The application consists of 5 controllers, a few plugins/helpers and the main engine that produces automatic summaries for the Greek language, which is a bunch of filters also written in ZF1. Besides that, there are Users and Administration systems in place to easily manage the internal data.

I actually managed to accomplish more than I thought for day one. I got the front-end controllers and the summarization engine working with ZF2 quite easily. But some things still really bothered me. The most annoying thing of the day is the new Zend\Db: it's obvious that they tried to enforce the Model part of the new MVC concept, and it's not there, not by a mile, not when you have to inject every custom model with the TableGateway, which in turn has to be injected with the Db\Adapter.

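namespace Album;

use Album\Model\Album;
use Album\Model\AlbumTable;
use Zend\Db\ResultSet\ResultSet;
use Zend\Db\TableGateway\TableGateway;

class Module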
{
    // getAutoloaderConfig() and getConfig() methods here

    // Add this method:
    public function getServiceConfig()
    {
        return array(
            'factories' => array(
                'Album\Model\AlbumTable' =>  function($sm) {
                    $tableGateway = $sm->get('AlbumTableGateway');
                    $table = new AlbumTable($tableGateway);
                    return $table;
                },
                'AlbumTableGateway' => function ($sm) {
                    $dbAdapter = $sm->get('Zend\Db\Adapter\Adapter');
                    $resultSetPrototype = new ResultSet();
                    $resultSetPrototype->setArrayObjectPrototype(new Album());
                    return new TableGateway('album', $dbAdapter, null, $resultSetPrototype);
                },
            ),
        );
    }
}

I am pretty sure there is a smarter way to go that people already use in their modules, but that official example is a joke. In my case there are only about 4 MySQL tables in the current code base, and the system is quite predictable about future features and queries, so I didn't bother with models and preferred the direct approach. Here is an example of ZF1 vs ZF2 for running raw queries.

 Zend Framework 1.11

$db = Zend_Db_Table::getDefaultAdapter();
$docu = $db->fetchOne("SELECT COUNT(*) as unprocessed FROM document WHERE processed = 0");

Zend Framework 2.0.3

$db = $this->getServiceLocator()->get('db');
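// current() returns the row as an associative array, so read the aliased column by name
$row = $db->query("SELECT COUNT(*) as total FROM document WHERE processed = 0")->execute()->current();
$docu = $row['total'];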

So my first day wasn't very positive. It was easy to port the basic stuff and start working, but on the other hand I am not seeing many advantages so far. Keep in touch, as I will post daily about my experience with ZF2.

It's been more than 6 months since I took a paying PHP job. My friend Stam from fmscout.com asked me to build a charts module for Cotonti to show information about his site's activity. It supports several areas, both from the core and from 3rd party plugins:

  • Authors
    • Top Forum Posters
    • Top Pages Contributors
    • Top Comment Posters
    • Most Thanked Users
  • Articles
    • Most Viewed
    • Most Commented
    • Most Thanked
    • Most Downloads
  • Forum Topics
    • Most Viewed
    • Most Replied
    • Most Thanked

Every section includes cut-off date options, like since last week, since last month or overall, fully customizable from the module's settings. Most of the information is easily retrieved, except for a few areas where extra work was needed to actually collect the data. Not really a hard job, maybe a little boring, but not bad for a comeback after 6 months, if you exclude Komposta cmf.

 

fmscoutCharts1.png fmscoutCharts2.png fmscoutCharts3.png fmscoutCharts4.png 

 

Cotonti Charts

 

fmscoutCharts5.png fmscoutCharts6.png fmscoutCharts7.png 

vBSed was the very first huge project I ever completed in PHP/MySQL, and it was for my own website 3dacc.net (r.i.p). To this day I believe it was one of the very few proper content management systems ever built for vBulletin v3.x.x, before v4.0 even came out with its ugly ugly ugly cms. 3dacc.net was originally powered by LDU/Seditio cms. At some point while the site was growing I decided Seditio was lacking in forum features and switched to the famous vBulletin. But while vBulletin was the best of the bbcode forum apps out there, it didn't have a proper cms, official or not. So vBSed was born.

vBSED = vBulletin + Seditio

vbsed_home.png vbsed_list_categories.png vbsed_list_pages.png vbsed_page.png

It wasn't a poorly written bridge. It was written from scratch as a vBulletin product (module) and was based on all the things I believe made Seditio a great cms: simplicity, easy to use, easy to extend, easy to maintain. It included unlimited levels of categories, pages of different types like articles, downloads and links, a file manager and various homepage widgets.

vbsed_admin_categories.png vbsed_gpudb.png vbsed_pages_stats.png vbsed_sfs.png

Some may not like the term proper cms, but consider that at the time the best solutions out there for vBulletin were a front-page mod with widgets and a fake cms that camouflaged forums as categories and topics as articles. I think there were a few paid mods, but they weren't very popular. At a later point a terminology section was added to vBSed, which was embedded in all the site's content (forum posts and pages), linking terms back to their definitions, wiki style.

vbsed_termsdb.png vbsed_termsdb_term.png

The site was doing ok. For the last two years I kept it open without spending a dime on the dedicated server's cost, but I lost interest, and web design and development won me over. vBSed never got its public release and I closed the site 3 years ago. I wish I had released this project; who knows, maybe now I wouldn't be job hunting...

The State Scholarships Foundation (IKY in Greek) is the official organization that manages scholarships in Greece, and once again I was asked to develop a web platform to manage applications for scholarships. If you read my previous post, I had already worked on a similar project for a different scholarships program, but for some reason that one wasn't organized and managed by IKY but by a sub-division inside the Ministry of Education. To this day I haven't figured out why. Greece, the country of amazing and bizarre things ;p

Thankfully IKY knew how to run things and it was so much easier to work with them. Besides the applications, the web platform also had to manage the evaluation process. The system had to manage the evaluators, who were independent professors from all over the world, and blindly assign applications to evaluators with the proper scientific background. A huge project and probably my best work from last year. Time for info and pictures.

  • Powered By Zend Framework
  • PHP/MySql/Html/Css/Jquery
  • Completion Time: 6 weeks plus 8 weeks for tech support for applicants and evaluators
  • August - November 2011

ikyhome.png  ikyappbefore.png ikyapp1.png ikyapp2.png

 

The website was hosted here http://apps.gov.gr/minedu/iky_scholarships, but unfortunately this year they didn't use my application. Back in February I got an email from my old partner in the Greek Ministry of Education asking for help. Unfortunately I had no time to spare, my military obligations were coming up, and I kind of ignored him. For some reason none of the programmers in Minedu and IKY did PHP; most of them were Java and ASP programmers. My guess is that they didn't find someone who could manage my application and decided to either make a new platform from scratch or, most likely, give the job to a private company. They were going to do the same last year as well, since their budget allowed it, but as I later learnt they couldn't find a company to finish the project in such a short time.

 

 

 ikyapp3.png ikyappafter.png ikyadminhome.png ikyadminexport.png

I really regret not responding to my old partner. Now I am broke, unemployed and too embarrassed to call them back and ask if there is any opening for a web programmer. Hopefully something will come up soon.

 ikyadminappsmanage.png ikyadminapps.png ikyevaluator.png ikyevaluatorevaluate.png

This was the first project I completed for the Greek Ministry of Education last year. Through bilateral agreements, students from Greece can apply for scholarships to study abroad, and vice versa. The selection is done by the student's country and the students are accepted by the university abroad without interviews or a CV. The project had problems from the beginning. Part of the blame is on me because of my inexperience at the time: I thought I would be working with people who actually understood the problems of transferring an application form from paper to web. It turned out they were bureaucrat apes, and because of them the ministry missed the deadline to send the students' information and hundreds of Greek students missed the opportunity to study abroad. As far as I know they didn't attempt to run the scholarships program this year.

Anyway enough of my badmouthing. Some application info and pictures.

  1. Powered by Joomla 1.61
  2. Work involved php/mysql/html/javascript
  3. Hosted at http://apps.gov.gr/minedu/scholarships (now dead)
  4. Full Administrative Application Management
  5. Completion Time: 2 Weeks + 3 Weeks tech support during the apps period

sc1app1.jpg sc1app2.jpg sc1admin1.png sc1admin2.png

High school students in Greece are admitted to public universities based on their final exams during their senior year. They submit forms with their information and a selection of the schools they want to attend. Last year the Greek Ministry of Education decided to simplify the application procedure by making it web only. Two teams of programmers were formed: one managed the web platform for students who finished high school in Greece, and another the one for Greek or foreign students who finished high school abroad.

I was part of the second team, which consisted of only two people. I was in charge of building the web platforms, and Athanasios Rouskas was something like the project manager, bringing me all the specifications and dealing with the other departments' delays and bureaucracy.

Below you can find screenshots from http://mixanografiko-eksoterikou.opengov.gr/ where the application is hosted.

selectSchools.png review.png massEmail.png main.png form.png adminMain.png adminApps.png 

 

The web platform was used again this year, with no updates. The second application, which had minor changes to meet the state criteria for foreign students, was actually completed by Rouskas, who had never seen PHP code before but was so damn eager to learn, and learn he did. That application was also used this year here http://mixanografiko-alodapon.opengov.gr

Edit: 15/8/2013
I am actually very proud to see that both platforms are online again this year, 3 years and counting (2011-2013)

The application was built with Zend Framework/MySQL/jQuery. Below you can also find the sample documents the system creates with Zend_Pdf for the users to verify that everything went fine.

Read about the updated version Sum+My here.

Automatic Summarization for the Greek language is the title of the undergraduate thesis I completed almost a year ago. Because of the obvious difficulties I chose to use a PHP framework for the first time. I wanted to focus on the task at hand without worrying about the basics of a web platform. I could have gone with a cms like Joomla or Drupal, but at the time I found Zend Framework 1.xx.x to be so much better for the job because of the excellent documentation and online sources.


Automatic summarization is a broad subject in the science of natural language processing. There are two main methodologies you can follow, extraction and abstraction. The second is the hardest and involves machine learning techniques (AI). Basically the machine has to learn how to produce the summaries, pretty much like a human. The extraction model (shallow) is based on maths and doesn't really create sentences from scratch: all the sentences come directly from the original document without any alteration.

For an undergraduate thesis, working on the abstraction model is a little too much, and honestly a semester is not enough time, so I based the application on shallow methods. In an attempt not to just reproduce the few sources out there, I tried to use and combine as many algorithms as possible. For each sentence the system produces three scores (terms, position and keywords), which can be combined with different weights to compute the sentence's final score.
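The weighting idea can be sketched roughly like this. It is only an illustration, not the code of the actual application: the function name `combineScores`, the array keys and the normalisation by the sum of the weights are all assumptions made for the example.

```php
<?php
// Hypothetical helper (not from the thesis code): combine the three
// per-sentence scores into one final score using user-chosen weights.
function combineScores(array $scores, array $weights): float
{
    $total = 0.0;
    $weightSum = 0.0;
    foreach (['terms', 'position', 'keywords'] as $name) {
        $weight = $weights[$name] ?? 0.0;
        $total += $weight * ($scores[$name] ?? 0.0);
        $weightSum += $weight;
    }
    // Assumption: divide by the sum of the weights so the final score
    // stays in the same range as the individual scores.
    return $weightSum > 0 ? $total / $weightSum : 0.0;
}

// Usage: terms count twice as much as position and keywords.
$final = combineScores(
    ['terms' => 0.5, 'position' => 1.0, 'keywords' => 0.0],
    ['terms' => 2, 'position' => 1, 'keywords' => 1]
);
```

The sentences with the highest final scores are the ones that would end up in the summary.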


For the Terms score the user can choose between:

  1. TF-ISF (Term Frequency - Inverse Sentence Frequency) 
  2. TF-IDF (Term Frequency - Inverse Document Frequency)
  3. TF-RIDF (Term Frequency - Residual Inverse Document Frequency)

Are you still here, reader? Ok, stay with me, I will be quick. Basically every method produces a score for each word, based respectively:

  1. On all the words inside the document
  2. On all the words inside a collection of documents
  3. 2 + Poisson Distribution Model
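To give an idea of the first option, here is a minimal TF-ISF sketch. It assumes the text has already been split into sentences and each sentence into stemmed terms with the stop words removed. The function name `tfIsfScores` and the averaging over the sentence length are choices made for this example, not necessarily what the application does.

```php
<?php
// Hypothetical sketch of TF-ISF sentence scoring.
// $sentences: list of sentences, each one a list of (stemmed) terms.
function tfIsfScores(array $sentences): array
{
    $sentenceCount = count($sentences);

    // Sentence frequency: in how many sentences does each term appear?
    $sentenceFrequency = [];
    foreach ($sentences as $terms) {
        foreach (array_unique($terms) as $term) {
            $sentenceFrequency[$term] = ($sentenceFrequency[$term] ?? 0) + 1;
        }
    }

    $scores = [];
    foreach ($sentences as $index => $terms) {
        $score = 0.0;
        // array_count_values() gives the term frequency inside this sentence.
        foreach (array_count_values($terms) as $term => $tf) {
            $score += $tf * log($sentenceCount / $sentenceFrequency[$term]);
        }
        // Assumption: average over the sentence length so that long
        // sentences are not favoured just for being long.
        $scores[$index] = count($terms) > 0 ? $score / count($terms) : 0.0;
    }
    return $scores;
}

// Usage: "cat" appears in both sentences, so it contributes nothing.
$scores = tfIsfScores([
    ['cat', 'sat'],
    ['cat', 'ran', 'ran'],
]);
```

TF-IDF follows the same pattern, only the frequencies are counted over a collection of documents instead of the sentences of a single document.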



For the Position score the user can choose between:

  1. Baxendale's research
  2. News article

Baxendale was a researcher who concluded that in 85% of paragraphs the topic sentence came first and in 7% of paragraphs the last sentence was the topic sentence. Thus, a naive but fairly accurate way to select a topic sentence is to choose one of these two. The News Article algorithm basically scores sentences dynamically and clearly favours the first sentences of the first paragraphs.

The Keywords score is actually a cheat: by providing keywords, the system can score the sentences that contain them higher than the others. This way the system finds the key sentences more easily instead of guessing like the scoring methods above. Additionally, the user can set a words per sentence threshold, so the system can ignore sentences that are too long or too short. The system also uses stop word lists to ignore common terms, as well as a Greek language stemming algorithm to group words.
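One naive way to turn Baxendale's percentages into a position score per paragraph is sketched below. The 0.85 and 0.07 values come straight from the numbers above; spreading the remaining 0.08 evenly over the middle sentences, as well as the function name `baxendalePositionScores`, are assumptions made only for this example.

```php
<?php
// Hypothetical sketch: position scores for the sentences of one paragraph.
function baxendalePositionScores(int $sentenceCount): array
{
    if ($sentenceCount <= 0) {
        return [];
    }
    $scores = array_fill(0, $sentenceCount, 0.0);
    $scores[0] = 0.85;                          // first sentence
    if ($sentenceCount > 1) {
        $scores[$sentenceCount - 1] = 0.07;     // last sentence
    }
    if ($sentenceCount > 2) {
        // Assumption: the remaining 8% is shared by the middle sentences.
        $middle = 0.08 / ($sentenceCount - 2);
        for ($i = 1; $i < $sentenceCount - 1; $i++) {
            $scores[$i] = $middle;
        }
    }
    return $scores;
}

// Usage: a paragraph with four sentences.
$position = baxendalePositionScores(4);
```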
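A rough sketch of these three helpers follows. All the function names are made up for the example, and the words are assumed to be already lowercased and stemmed (for Greek text that would need the multibyte string functions, which are left out here).

```php
<?php
// Hypothetical sketch: the fraction of the user's keywords found in a sentence.
function keywordScore(array $terms, array $keywords): float
{
    $keywords = array_unique($keywords);
    if (count($keywords) === 0) {
        return 0.0;
    }
    $hits = count(array_intersect($keywords, $terms));
    return $hits / count($keywords);
}

// Hypothetical sketch: skip sentences that are too short or too long.
function withinLengthThreshold(array $words, int $min, int $max): bool
{
    $count = count($words);
    return $count >= $min && $count <= $max;
}

// Hypothetical sketch: drop the common words found in a stop word list.
function removeStopWords(array $words, array $stopWords): array
{
    return array_values(array_diff($words, $stopWords));
}

// Usage with English words to keep the example readable.
$terms = removeStopWords(['the', 'greek', 'summary', 'and', 'text'], ['the', 'and']);
$score = keywordScore($terms, ['greek', 'stemmer']);
$keep  = withinLengthThreshold($terms, 2, 30);
```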

The application is hosted here http://thesis.t3-design.com/, soon to be hosted on this domain. If you are not Greek you probably won't understand much, but hey, it was my first Zend application and I am really proud of it. And please take a look at the credits section, I couldn't have done it without them.

You can download my paper here. Once again it's in Greek, but don't worry: if you want the gist of it, my supervisors published a scientific paper based on my work, which you can find here.

Time for some pictures (new files system in place ;p)