Archive for June, 2011

ArchiverUI: Using Pylons with Storm ORM

June 29, 2011

Storm Objects:

We basically need two tables: 1) Article and 2) Conversation. Andrew had already worked out the StormArticle object, so I added a Conversation object as well:

from storm.locals import List, Unicode

class Conversation(object):
    __storm_table__ = "conversation"
    con_id = Unicode(primary=True)  # or thread_addr
    msgids = List()  # list of msgids which are part of this conversation

But I have been unable to use this List property with SQLite. I also realised that we should keep some more information in the conversation object in order to avoid multiple queries when building the conversation list page.

So I went ahead and assumed the following article and conversation objects in order to get started with Pylons.
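A possible workaround for the List problem, sketched with only Python's standard sqlite3 and json modules rather than Storm: serialise the msgid list into a TEXT column. The helper names save_conversation and load_msgids are hypothetical and not part of ArchiverUI; the table layout simply mirrors the first Conversation class.

```python
import json
import sqlite3


def save_conversation(conn, con_id, msgids):
    """Store the msgid list as a JSON string in a TEXT column."""
    conn.execute(
        "INSERT OR REPLACE INTO conversation (con_id, msgids) VALUES (?, ?)",
        (con_id, json.dumps(msgids)),
    )


def load_msgids(conn, con_id):
    """Return the msgid list for a conversation, or [] if it is unknown."""
    row = conn.execute(
        "SELECT msgids FROM conversation WHERE con_id = ?", (con_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


# Assumed schema: one row per conversation, list stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conversation (con_id TEXT PRIMARY KEY, msgids TEXT)")
save_conversation(conn, "thread-1", ["<a@example.org>", "<b@example.org>"])
print(load_msgids(conn, "thread-1"))
```

The trade-off is that the list cannot be queried with SQL; a separate table keyed by thread_addr avoids that, which is the direction the revised classes take.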

from storm.locals import Float, ReferenceSet, Unicode

class StormArticle(object):
    """Class to represent an Article in a sqlite database rather than using pickles."""
    __storm_table__ = "article"
    archive = Unicode()
    sequence = Unicode()
    subject = Unicode()
    datestr = Unicode()
    date = Float()
    # headers = List()
    author = Unicode()
    email = Unicode()
    msgid = Unicode(primary=True)
    in_reply_to = Unicode()
    # references = List()
    body = Unicode()
    thread_addr = Unicode()
    # conversation = Reference(thread_addr, Conversation.thread_addr)

class Conversation(object):
    """Conversation class for sqlite database"""
    __storm_table__ = "conversation"
    thread_addr = Unicode(primary=True)
    subject = Unicode()
    datestr = Unicode()
    # All articles whose thread_addr matches this conversation
    articles = ReferenceSet(thread_addr, StormArticle.thread_addr)
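Storm does not create tables itself, so the two classes need matching tables in the SQLite file. The following is only a sketch of what that schema and the lookup behind the articles ReferenceSet might look like, using plain sqlite3 instead of Storm; the column types and the articles_in helper are assumptions, not the actual ArchiverUI code.

```python
import sqlite3

# Assumed schema mirroring the StormArticle and Conversation attributes.
SCHEMA = """
CREATE TABLE article (
    msgid TEXT PRIMARY KEY,
    archive TEXT, sequence TEXT, subject TEXT, datestr TEXT, date REAL,
    author TEXT, email TEXT, in_reply_to TEXT, body TEXT, thread_addr TEXT
);
CREATE TABLE conversation (
    thread_addr TEXT PRIMARY KEY,
    subject TEXT,
    datestr TEXT
);
"""


def articles_in(conn, thread_addr):
    """Return the msgids of one conversation, oldest first."""
    rows = conn.execute(
        "SELECT msgid FROM article WHERE thread_addr = ? ORDER BY date",
        (thread_addr,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO conversation VALUES (?, ?, ?)",
    ("t1", "Hello", "Wed Jun 29 2011"),
)
conn.executemany(
    "INSERT INTO article (msgid, subject, date, thread_addr) VALUES (?, ?, ?, ?)",
    [
        ("<2@x>", "Re: Hello", 2.0, "t1"),
        ("<1@x>", "Hello", 1.0, "t1"),
        ("<3@x>", "Other", 3.0, "t2"),
    ],
)
print(articles_in(conn, "t1"))
```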

Sqlite Database:

After (more or less) finalizing the schema of the tables, I needed a sample database to fill them. So I added support to the existing pipermail for creating and filling a SQLite database from mbox files (only for the article table). This gave me the db file I needed to develop ArchiverUI with Pylons.
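For illustration, here is a minimal standalone sketch of filling an article table from an mbox file with the standard mailbox and sqlite3 modules. It is not the pipermail change described above: the import_mbox function, the reduced column set and the sample message are all assumptions made for the example.

```python
import mailbox
import os
import sqlite3
import tempfile
from email.utils import mktime_tz, parsedate_tz


def import_mbox(mbox_path, conn):
    """Copy basic headers of every message in an mbox into an article table.

    Returns the number of messages read that carried a Message-ID.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS article ("
        "msgid TEXT PRIMARY KEY, subject TEXT, author TEXT, "
        "datestr TEXT, date REAL, in_reply_to TEXT)"
    )
    count = 0
    box = mailbox.mbox(mbox_path)
    for msg in box:
        msgid = msg.get("Message-ID")
        if msgid is None:
            continue  # cannot be keyed, skip it
        datestr = str(msg.get("Date", ""))
        parsed = parsedate_tz(datestr)
        date = float(mktime_tz(parsed)) if parsed else None
        reply_to = msg.get("In-Reply-To")
        conn.execute(
            "INSERT OR IGNORE INTO article VALUES (?, ?, ?, ?, ?, ?)",
            (
                str(msgid),
                str(msg.get("Subject", "")),
                str(msg.get("From", "")),
                datestr,
                date,
                str(reply_to) if reply_to is not None else None,
            ),
        )
        count += 1
    box.close()
    conn.commit()
    return count


# Build a tiny sample mbox so the sketch can be run as-is.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.mbox")
    sample = mailbox.mbox(path)
    m = mailbox.mboxMessage()
    m["Message-ID"] = "<1@example.org>"
    m["Subject"] = "Hello"
    m["From"] = "alice@example.org"
    m["Date"] = "Wed, 29 Jun 2011 10:00:00 +0000"
    m.set_payload("First message\n")
    sample.add(m)
    sample.flush()
    sample.close()

    conn = sqlite3.connect(":memory:")
    imported = import_mbox(path, conn)
    print(imported)
```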

Pylons + Storm:

Developed a very basic framework to generate the conversation list and the conversation page. So far only the conversation page is tested. LOTS of things to do. Code can be found at: http://code.launchpad.net/~dushyant37/+junk/ArchiverUI

Right now there is no interaction with mailman/pipermail; all it requires is a SQLite file.

Unicode:

While parsing article bodies from the mbox file to store them in the database, I got stuck on lots of UnicodeErrors. It’s sort of solved now, but I still need to get the concepts related to Unicode and i18n clear.
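One common way to avoid such errors is to decode the raw payload with the charset the message declares and fall back when that fails. The sketch below assumes that approach (it is not necessarily how ArchiverUI solved it) and is written for Python 3; body_as_text is a hypothetical helper name.

```python
from email import message_from_bytes


def body_as_text(msg):
    """Decode a non-multipart message body to str without raising UnicodeError.

    For multipart messages get_payload(decode=True) returns None, so this
    sketch returns an empty string for them.
    """
    raw = msg.get_payload(decode=True) or b""
    declared = msg.get_content_charset()  # None when no charset is declared
    for charset in (declared, "utf-8"):
        if not charset:
            continue
        try:
            return raw.decode(charset)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown or wrong charset, try the next candidate
    # latin-1 maps every byte to a character, so this cannot fail.
    return raw.decode("latin-1")


good = message_from_bytes(
    b"Content-Type: text/plain; charset=iso-8859-1\n\ncaf\xe9\n"
)
mislabelled = message_from_bytes(
    b"Content-Type: text/plain; charset=utf-8\n\ncaf\xe9\n"
)
print(body_as_text(good), body_as_text(mislabelled))
```

The latin-1 fallback may produce the wrong characters for mislabelled text, but it keeps the import running, which matters when archiving a whole mbox.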

IMMEDIATE TODO:

  • Finalize the Storm object representation of conversation and article, depending on whether List can be used with SQLite

New approach: Dynamic page generation

June 22, 2011

Static HTML:

TODO:

  • The main task left is to integrate work done by Andrew (converting Pipermail to use Storm/SQLite instead of pickles).
  • Other tasks include completion of some unfinished work (I’ll have to discuss these issues with mentors first), changes in UI, testing/debugging. For some time, my main priority would be to complete dynamic generation of pages.

Dynamic pages:

  • Decided to use the Pylons framework. Pylons is completely new to me (though I am familiar with Django).
  • Our requirement of using Storm ORM alone dictated this decision.
    Andrew is converting Pipermail to store its data with Storm/SQLite instead of pickles, so the framework needs to work with Storm ORM. Django has its own ORM and Storm cannot be used with it natively (Storm is said to have Django support, but I could not find enough information to try Storm with Django).
    Pylons doesn’t ship an ORM, and I have tested a small application using Storm with it.

  • Implemented a simple page using Storm to get started with Pylons and to test the Storm integration.
  • Started writing code to view conversations index.

June 12-15, 2011

June 15, 2011

Dynamic Pages:

Started with mod_wsgi, with my own WSGI application. Looked at a nice tutorial on developing a WSGI app with Paste (a framework for building web frameworks).

  • Inside the mailman source code: a REST server using wsgiref and an application using restish
  • Still undecided about whether to use a framework:
  • A framework would be very helpful. Basically, we need to show these three types of pages:
    1. Conversation list or thread (by pages or by months)
    2. View a conversation
    3. Everything related to search: mailocate + search by period
  • Framework requirements:
    • Must work without its own ORM, as we can use Storm
    • If mailman already uses some framework (restish, or django by benste), then use that.
    • Lightweight
    • Choices: Django and web2py.

Static Html pages:

I am trying to finish all the code related to static page generation.

  • Downloaded some .mbox file and tested archive_message() on message retrieved from that.
  • I had changed the pipermail code to minimize the work done when adding a new message to the archives, and it turns out that the specific case of archiving a full mbox file needs separate handling. Implemented that and tested it on some small mbox files.

June 11-12, 2011

June 12, 2011

Dynamic pages:

  • Read the basics of storm/sqlite
  • Got the basic understanding of Andrew’s approach through his blog.
  • Configured Apache and tested Python scripts with CGI and mod_wsgi. mod_wsgi seems to offer many advantages over CGI, although the mailocate project used CGI.
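As a rough sketch of what a hand-written WSGI application for mod_wsgi can look like: mod_wsgi looks for a callable named application by default. The CONVERSATIONS data and the URL layout below are hypothetical, and the call helper only exists to exercise the app without a web server.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical sample data: thread_addr -> conversation subject.
CONVERSATIONS = {"t1": "Hello list", "t2": "Release plans"}


def application(environ, start_response):
    """Serve a conversation list at / and one conversation at /<thread_addr>."""
    path = environ.get("PATH_INFO", "/").strip("/")
    if path == "":
        status = "200 OK"
        body = "\n".join(sorted(CONVERSATIONS.values()))
    elif path in CONVERSATIONS:
        status = "200 OK"
        body = CONVERSATIONS[path]
    else:
        status = "404 Not Found"
        body = "No such conversation"
    data = body.encode("utf-8")
    start_response(
        status,
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(data))),
        ],
    )
    return [data]


def call(path):
    """Invoke the app directly with a fake request and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")


print(call("/t1"))
```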

Static Html:

  • Update the html file of the conversation list: the entry corresponding to the updated conversation needs to be updated as well. ✔
  • Some issues with hide_quotes ✔
  • indentation in conversation ✔
  • differentiate between article.body() and article.html_body() ✔
  • clean up the code ✎
    • merge _in_reply_to and in_reply_to fields
  • commit.. ✔
  • Write down instructions and other notes to be able to install and see my modified code in a working condition ✔✎

June 6-9, 2011

June 10, 2011
  • Update the html file of the conversation list: the entry corresponding to the updated conversation needs to be updated as well. ✔✎
  • Some issues with hide_quotes ✔
  • indentation in conversation ✎
  • differentiate between article.body() and article.html_body() ✔
  • clean up the code ✔✎
  • commit.. ✔
  • Write down instructions and other notes to be able to install and see my modified code in a working condition ✔✎

✔: finished | ✔✎: unfinished | ✎: todo

I hope to complete all these unfinished tasks by today. Then, I’ll discuss my overall strategy with mentors.

June 2-6, 2011

June 6, 2011
  • In archive-ui, when a message is archived, all the conversations are generated again. Ideally, we should do the minimum amount of work when archiving a message. Finalised a strategy to achieve this.
  • Implemented it.
    • Similar to {date, subject, article, author, thread}, keep a new dumbBTree of conversations which stores the mapping: thread_addr –> list of msgids present in that conversation
    • when a new message is archived, conversation file corresponding to its thread_addr is built from list of msgids.
  • TO-DO:
    • Update the html file of the conversation list: the entry corresponding to the updated conversation needs to be updated as well.
    • Some issues with hide_quotes and indentation in conversation
    • differentiate between article.body() and article.html_body()
    • clean up the code
    • commit..
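The strategy above (a thread_addr to msgids mapping, with only the affected conversation rebuilt) can be sketched roughly as follows. An in-memory dict stands in for the on-disk dumbBTree, and the class and method names are hypothetical; the real code writes HTML where this sketch only records which conversation was regenerated.

```python
from collections import defaultdict


class ConversationIndex:
    """Maps thread_addr -> list of msgids, standing in for the on-disk index."""

    def __init__(self):
        self.msgids_by_thread = defaultdict(list)
        self.rebuilt = []  # log of regenerated conversations, for illustration

    def archive_message(self, msgid, thread_addr):
        """Record the message and rebuild only the conversation it belongs to."""
        self.msgids_by_thread[thread_addr].append(msgid)
        return self.rebuild_conversation(thread_addr)

    def rebuild_conversation(self, thread_addr):
        """Placeholder for writing the conversation file from its msgids."""
        self.rebuilt.append(thread_addr)
        return list(self.msgids_by_thread[thread_addr])


index = ConversationIndex()
index.archive_message("<1@x>", "t1")
index.archive_message("<2@x>", "t2")
latest = index.archive_message("<3@x>", "t1")
print(latest, index.rebuilt)
```

Each archived message triggers exactly one rebuild, regardless of how many conversations exist.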

May 31 – June 1, 2011

June 2, 2011
  • Thought through several archive-ui issues and discussed them on the mailing list.
  1. Arranging conversations by months: In pipermail, the static HTML files generated for each archived message are arranged in directories corresponding to different months.
    In archive-ui, by contrast, all the conversations are stored in a single directory, because a conversation can span several months and it is ambiguous where to keep it.
    Conclusion:
    1. No advantage in arranging messages on a monthly basis.
    2. Implement “searching for messages by period” functionality. This was not implemented in archive-ui, but a use case and solution were developed.
  2. Readable URLs: The URL of a conversation can be created from the subject of its first message (the same way various blog sites do).
    Conclusion:
    1. Part of this is already implemented in archive-ui. I’ll leave it for now.
  3. In archive-ui, when a message is archived, all the conversations are generated again. Ideally, we should do the minimum amount of work on archiving a message.
    Conclusion:
    1. No decision on this. Will try to discuss it further.
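A minimal sketch of "searching for messages by period", assuming articles live in a SQLite table with a numeric date column holding seconds since the epoch (as the Float date field elsewhere in these notes suggests). The epoch and search_by_period helpers are hypothetical and are not the mailocate interface.

```python
import sqlite3
from calendar import timegm
from datetime import datetime


def epoch(year, month, day):
    """Seconds since the epoch for midnight UTC on the given day."""
    return float(timegm(datetime(year, month, day).timetuple()))


def search_by_period(conn, start, end):
    """Return msgids with start <= date < end, oldest first."""
    rows = conn.execute(
        "SELECT msgid FROM article WHERE date >= ? AND date < ? ORDER BY date",
        (start, end),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (msgid TEXT PRIMARY KEY, date REAL)")
conn.executemany(
    "INSERT INTO article VALUES (?, ?)",
    [
        ("<may@x>", epoch(2011, 5, 31)),
        ("<jun2@x>", epoch(2011, 6, 29)),
        ("<jun1@x>", epoch(2011, 6, 2)),
        ("<jul@x>", epoch(2011, 7, 1)),
    ],
)
june = search_by_period(conn, epoch(2011, 6, 1), epoch(2011, 7, 1))
print(june)
```

Using a half-open interval means consecutive periods never return the same message twice.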

mailocate: work by Priya to implement search capabilities in mailman.

  • Wrote the basic function for “search by date” functionality. Checked out the code of “mailocate”. The interface for “search by date” will also be implemented in mailocate.
  • Updated some more templates of pipermail to match archive-ui.
  • Handling Top/Bottom Posting
    By default, the quoted text can either be all hidden or all displayed. It might also be good to hide the quoted text only when it is at the end of the message; when it is in the middle, the user is likely to want to see it for context. Started working to test this on mm3 and also to account for one more test case:
    -> On mailing lists, people generally use ‘inline posting’ to reply. One level of inline posting is helpful to see the context, but if it goes beyond one level, it might be a good idea to hide the older quoted text.
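The two rules discussed above (hide a trailing quoted block, and hide quotes nested more than one level deep) can be sketched as below. This is only an illustration under the assumption that quoted lines start with ">" markers; quote_depth and lines_to_hide are hypothetical names, not the hide_quotes code in archive-ui.

```python
def quote_depth(line):
    """Count leading '>' markers, allowing spaces between them ("> > text")."""
    depth = 0
    for ch in line:
        if ch == ">":
            depth += 1
        elif ch != " ":
            break
    return depth


def lines_to_hide(body, max_depth=1):
    """Return the sorted indexes of lines that should be collapsed."""
    lines = body.splitlines()
    hidden = set()
    # Rule 1: quotes nested deeper than max_depth are hidden everywhere.
    for i, line in enumerate(lines):
        if quote_depth(line) > max_depth:
            hidden.add(i)
    # Rule 2: a quoted block at the very end of the message (top posting)
    # is hidden; blank lines inside or after it are skipped over.
    i = len(lines) - 1
    while i >= 0 and (quote_depth(lines[i]) > 0 or not lines[i].strip()):
        if quote_depth(lines[i]) > 0:
            hidden.add(i)
        i -= 1
    return sorted(hidden)


top_posted = "I agree.\n\n> old text\n> > older text\n"
inline = "> > older question\n> question?\nAnswer here.\n"
print(lines_to_hide(top_posted), lines_to_hide(inline))
```

In the inline reply only the second-level quote is hidden, so the question being answered stays visible for context.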