Archive

Posts Tagged ‘mailman’

Mailman3, Archiver and Archiver-UI

August 19, 2011

  • Mailman3: MM3 sends a message to a local Archiver instance via a subprocess call.
    Status: Updated archive_message() function.

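        import os
        import subprocess

        def archive_message(mlist, message):
            """See `IArchiver`."""
            # Build the argument list directly so that the list name is
            # interpolated and '~' is expanded (no shell is involved).
            command = [
                'python', '/home/dushyant/projects/Archiver/archiver.py',
                '-basedir', os.path.expanduser('~/projects/Archiver/var'),
                '-listname', mlist.fqdn_listname, 'add']
            proc = subprocess.Popen(
                command, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                stdin=subprocess.PIPE, shell=False)
            # Feed the raw message to the archiver on stdin.
            stdout, stderr = proc.communicate(message.as_string())
            # `log` is the module-level logger of the surrounding module.
            if proc.returncode != 0:
                log.error('%s: archiver subprocess had non-zero exit code: %s' %
                          (message['message-id'], proc.returncode))
            log.info(stdout)
            log.error(stderr)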

    The configuration schema still needs to be modified and the pipermail docs updated.

  • Archiver: Separated out from MM3; generates and maintains a sqlite database and a whoosh index from the archived messages.
    Status: Basic implementation is finished. I need to discuss i18n and the language preference of messages with Barry. It also requires some minor fixes and code cleanup.
  • Archiver-UI: A web interface to view and search the archives, built on the pylons framework. All it needs is access to the sqlite database and whoosh index maintained by Archiver.
    Status: More or less finished. Just requires a few improvements in the UI.
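
To make the command line hand-off concrete, here is a minimal sketch of what the receiving side could look like. It is not the actual archiver.py; it only assumes the flags visible in the command above (`-basedir`, `-listname` and the `add` action) and that the message arrives on stdin. The function names and the sample message are made up for illustration, and the sketch is written for Python 3.

```python
import argparse
import email
import io


def parse_args(argv):
    """Parse the flags used in the command line shown in the post."""
    parser = argparse.ArgumentParser(prog='archiver.py')
    parser.add_argument('-basedir', required=True)
    parser.add_argument('-listname', required=True)
    parser.add_argument('action', choices=['add'])
    return parser.parse_args(argv)


def main(argv, stdin):
    """Read one message from `stdin` and report what would be archived."""
    args = parse_args(argv)
    msg = email.message_from_file(stdin)
    # A real archiver would store the message under args.basedir and update
    # its index here; this sketch only returns what it received.
    return args.listname, msg['message-id'], msg['subject']


# Hypothetical message, fed through a StringIO instead of sys.stdin.
raw = 'Message-ID: <1@example.com>\nSubject: hello\n\nbody text\n'
result = main(
    ['-basedir', '/tmp/var', '-listname', 'test@example.com', 'add'],
    io.StringIO(raw))
print(result)
```

In a real script the last lines would be replaced by a call such as `main(sys.argv[1:], sys.stdin)`.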

Integrating mailman and archiver through a command line api

August 15, 2011

Status:

  • Currently, only the latest messages by thread are shown. I am adding support to show messages by date
  • Highlighting message in the whole conversation(thread)
  • Navigation:
    From one conversation(thread) to next
    from one message to other (by date and by thread)
  • Order by: View messages ordered by subject, date etc.
  • Improving conversion script (mbox to sqlite db file)
  • Generate a better url and thread_addr for conversation page from message subject
  • Refer to a message with id and not msgid
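
One possible way to derive a readable url fragment from a message subject is sketched below. This is only an illustration of the idea, not the code used in ArchiverUI; the function name `subject_to_slug` and the rules (strip reply prefixes and a leading list tag, keep ASCII letters and digits) are my assumptions. Note that slugs built this way are not unique, so a real thread_addr would still need something like an id appended.

```python
import re
import unicodedata


def subject_to_slug(subject, max_length=60):
    """Turn a message subject into a URL-friendly slug (hypothetical helper)."""
    # Drop any number of leading reply/forward prefixes such as "Re:" or "Fwd:".
    subject = re.sub(r'^\s*((re|fwd?)\s*:\s*)+', '', subject,
                     flags=re.IGNORECASE)
    # Drop a leading list tag such as "[Mailman-Developers]".
    subject = re.sub(r'^\[[^\]]*\]\s*', '', subject)
    # Reduce accented characters to their ASCII base letters where possible.
    ascii_text = (unicodedata.normalize('NFKD', subject)
                  .encode('ascii', 'ignore').decode('ascii'))
    # Replace every run of other characters with a single hyphen.
    slug = re.sub(r'[^a-z0-9]+', '-', ascii_text.lower()).strip('-')
    return slug[:max_length].rstrip('-') or 'no-subject'


print(subject_to_slug('Re: RE: [Mailman-Developers] Archiver UI: status?'))
```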
Integrating mailman and archiver:
  • Discussed with Barry and decided to integrate mailman and archiver through a command line api (as done by the mhonarc archiver). Right now, I am working on its implementation.

Some more things to do:

August 6, 2011

I hosted archiverUI on aws and showed it to Anna. Along with her suggestions and some things that I had in my mind, I prepared a todo list.

These are the things that I am working on:

  • Currently, only the latest messages by thread are shown. I am adding support to show messages by date
  • Highlighting message in the whole conversation(thread)
  • Navigation:
    From one conversation(thread) to next
    from one message to other (by date and by thread)
  • Order by: View messages ordered by subject, date etc.
  • Improving conversion script (mbox to sqlite db file)
After I am done with these todos, I’ll show the demo to other mailman developers.

Profiling, Testing and Documentation

July 25, 2011

Performance and Profiling:

Pylons:

    • Template caching:
              return render('base/csubjects.html', cache_type='memory',
                             cache_expire=60)

      This keeps the template output in memory cache for a period of 60 sec. This has resulted in a huge performance gain.
      Also, cache_expire time can be more than 60 sec. in our case.

    • Turn off debug in the .ini file and don't use --reload when starting the server
    • Turn off Mako’s template checking:
      Another important flag on TemplateLookup is filesystem_checks. This defaults to True, and says that each time a template is returned by the get_template() method, the revision time of the original template file is checked against the last time the template was loaded, and if the file is newer will reload its contents and recompile the template. On a production system, setting filesystem_checks to False can afford a small to moderate performance increase (depending on the type of filesystem used).

Profiling using repoze.profile:

  • Integrated repoze.profile with ArchiverUI to see which portion of the code takes the longest. Right now, it seems that the hide_quoting component can be further optimized.
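
For checking a single function outside the web stack, the standard library profiler can be used directly. The sketch below is not the repoze.profile setup described above, and the `hide_quoting` shown here is a hypothetical stand-in (it simply collapses each run of `>` quoted lines into one marker), not the ArchiverUI implementation.

```python
import cProfile
import io
import pstats


def hide_quoting(body):
    """Hypothetical stand-in: collapse each run of quoted lines into a marker."""
    out, in_quote = [], False
    for line in body.splitlines():
        if line.lstrip().startswith('>'):
            if not in_quote:
                out.append('[quoted text hidden]')
                in_quote = True
        else:
            out.append(line)
            in_quote = False
    return '\n'.join(out)


sample = 'Hi,\n> old line 1\n> old line 2\nMy reply.\n> more\nBye.'

# Profile many calls so that the function shows up clearly in the report.
profiler = cProfile.Profile()
profiler.enable()
for _ in range(1000):
    hide_quoting(sample)
profiler.disable()

# Print the five most expensive entries by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats('cumulative').print_stats(5)
print(report.getvalue())
```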
Testing:
  • Functional Testing with Nose:
    Added some tests to check basic responses of different pages of ArchiverUI. For example, make sure the response is 404 if any invalid url is passed and response contains correct page for every valid url.
Documentation:
  • Made a few modifications to make the code more readable and started adding docstrings
Integrating with Mailman:
Currently, ArchiverUI just requires a sqlite db file containing tables for articles and conversations. I have kept one more table in db containing information about all the mailing lists.

from storm.locals import Int, Unicode

class Mlist(object):
    """Class to represent an mlist table in a sqlite database"""
    __storm_table__ = "mlist"
    id = Int(primary=True)
    list_name = Unicode()

    # Path to the archives database corresponding to the list_name mlist
    db_path = Unicode()
Now, these things need to be done:
Mailman:

  • whenever a new list is created, update mlist table
  • whenever a message is archived, update the database file corresponding to that mlist.
ArchiverUI:

  • Generalize for more than one mailing list
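
The two Mailman-side steps can be sketched with plain sqlite3, without Storm. This is a rough illustration under my own assumptions: the function names, the UNIQUE constraint on list_name and the example paths are not from the project; only the mlist columns (id, list_name, db_path) come from the class above.

```python
import sqlite3


def open_registry(path=':memory:'):
    """Open (and if needed create) the database holding the mlist table."""
    conn = sqlite3.connect(path)
    conn.execute(
        'CREATE TABLE IF NOT EXISTS mlist ('
        ' id INTEGER PRIMARY KEY,'
        ' list_name TEXT UNIQUE NOT NULL,'
        ' db_path TEXT NOT NULL)')
    return conn


def register_list(conn, list_name, db_path):
    """Step 1: called whenever a new list is created."""
    with conn:
        conn.execute(
            'INSERT OR IGNORE INTO mlist (list_name, db_path) VALUES (?, ?)',
            (list_name, db_path))


def db_path_for(conn, list_name):
    """Step 2: find which archive database a message should be written to."""
    row = conn.execute(
        'SELECT db_path FROM mlist WHERE list_name = ?',
        (list_name,)).fetchone()
    return row[0] if row else None


registry = open_registry()
register_list(registry, 'test@example.com', '/var/archives/test.db')
print(db_path_for(registry, 'test@example.com'))
```

ArchiverUI could use the same lookup to pick the right database per request once it handles more than one list.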

Integrating Search

July 19, 2011

I have added support for searching in archiverUI. Most of the logic/code that was written by Priya as part of last year’s GSOC has been reused with some changes.

  • Indexing: For the first time, indexing is done for all the messages stored in sqlite database file. After that, every time index_archives() is called, only new messages (that are added after the last call to index_archives() ) are indexed.
  • Index Schema:
    from whoosh import fields
    from whoosh.analysis import StemmingAnalyzer

    schema = fields.Schema(
        msgid=fields.ID(stored=True, unique=True),
        author=fields.TEXT(),
        subject=fields.TEXT(analyzer=StemmingAnalyzer()),
        body=fields.TEXT(analyzer=StemmingAnalyzer()),
    )
    Here, we only need to store msgid (stored=True), as we can query the database with msgid to get the other fields. But this adds the overhead of a database query whenever the metadata of search results has to be shown. For now, let's leave it in this state; if it turns out to be the cause of poor performance, I'll store author, subject and body as well.
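
The incremental indexing and the msgid lookup can be illustrated without whoosh. In the sketch below a plain dict stands in for the search index, and "new messages" are detected through sqlite's rowid, which is an assumption on my part (it works as long as rows are only appended); the real index_archives() may track this differently. The table layout and sample rows are made up for the example.

```python
import sqlite3


def index_archives(conn, index, state):
    """Index only the articles added since the previous call."""
    rows = conn.execute(
        'SELECT rowid, msgid, subject, body FROM article'
        ' WHERE rowid > ? ORDER BY rowid',
        (state.get('last_rowid', 0),)).fetchall()
    for rowid, msgid, subject, body in rows:
        for word in (subject + ' ' + body).lower().split():
            index.setdefault(word, set()).add(msgid)
        state['last_rowid'] = rowid  # remember how far we got
    return len(rows)


def search(conn, index, word):
    """Return (msgid, author, subject) for each hit; only msgid is indexed."""
    results = []
    for msgid in sorted(index.get(word.lower(), ())):
        # This per-hit query is the overhead discussed above.
        results.append(conn.execute(
            'SELECT msgid, author, subject FROM article WHERE msgid = ?',
            (msgid,)).fetchone())
    return results


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE article ('
             ' msgid TEXT PRIMARY KEY, author TEXT, subject TEXT, body TEXT)')
conn.execute("INSERT INTO article VALUES ('<1@x>', 'Ann', 'Search', 'whoosh index')")
conn.execute("INSERT INTO article VALUES ('<2@x>', 'Bob', 'Re: Search', 'storm orm')")

index, state = {}, {}
first_run = index_archives(conn, index, state)   # indexes both messages
second_run = index_archives(conn, index, state)  # nothing new to index
conn.execute("INSERT INTO article VALUES ('<3@x>', 'Cy', 'Pylons', 'whoosh again')")
third_run = index_archives(conn, index, state)   # indexes only the new one
print(first_run, second_run, third_run, search(conn, index, 'whoosh'))
```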

Unit Test:

As Barry had suggested, I have started looking at writing tests and documentation. I have been following this excellent guide for testing and documentation for pylons.
http://pylonsbook.com/en/1.1/testing.html

ArchiverUI Status

July 11, 2011

Status:

I have finished the basic implementation of archiver UI using pylons framework and Storm ORM. Though it still requires some minor fixes, it covers:
1. conversation-list view with quick view of conversation messages(using ajax)
2. conversation page with indentation and quote hiding feature.

Code can be found at: http://code.launchpad.net/~dushyant37/+junk/ArchiverUI
Egg package: http://bazaar.launchpad.net/~dushyant37/+junk/ArchiverUI/files/head:/ArchiverUI/dist/
Other info: README.txt

ArchiverUI Demo:
ArchiverUI package just requires a sqlite database file. But currently, the test database file that I am using has not been generated from any list archives (mbox file), so, it is not good for demo purpose.

Initially, I wasn’t sure whether I should focus on generating a database for demo purposes. But then, I realised it would be good to get feedback from the mailman community. So, I started working on this and now, I am about to finish it.

Other issues:

Overall, generating pages dynamically seems to be a better way to view archives. I would like to discuss some other issues before proceeding further.

1. Performance: We need to look at the performance of the new archiver in order to improve it as well as to compare this approach of generating pages on the fly with the static one. Is there any specific way to go about it?
After that, performance gain can be achieved by proper use of caching which is offered by pylons and storm/sqlite.
– As I was working with jquery/javascript to add a little functionality to archiverUI, I used the “Inspect Element” feature provided by Chrome and Firefox (firebug). It also provides statistics on the loading time and performance of a website. This seems to be a good way to get started.

2. Interaction with mailman: Right now, the interaction with mailman (archiver part) is through a sqlite database file. So, we just need to update the relevant database files on archiving a message through mailman.
If we want to further separate out the archiver from mailman, we can use the methods used by mhonarc and mailarchive interfaces.

3. Search: I also plan to integrate search functionality(work done by Priya) for archives into this pylons project.

ArchiverUI: Using Pylons with Storm ORM

June 29, 2011

Storm Objects:

We basically need two tables 1) Article  2) Conversation. As Andrew had already figured out the StormArticle object, I added the conversation object as well:

from storm.locals import List, Unicode

class Conversation(object):
    __storm_table__ = "conversation"
    con_id = Unicode(primary=True)  # or, thread_addr
    msgids = List()  # List of msg ids which are part of this conversation

But I have been unable to use this List property with sqlite. Also, I realised that we should keep some more information with the conversation object in order to avoid multiple queries when building the conversation list page.

So, I just went ahead and assumed these article and conversation objects in order to get started with pylons.

from storm.locals import Float, ReferenceSet, Unicode

class StormArticle(object):
    """Class to represent an Article in a sqlite database rather than using pickles."""
    __storm_table__ = "article"
    archive = Unicode()
    sequence = Unicode()
    subject = Unicode()
    datestr = Unicode()
    date = Float()
    # headers = List()
    author = Unicode()
    email = Unicode()
    msgid = Unicode(primary=True)
    in_reply_to = Unicode()
    # references = List()
    body = Unicode()
    thread_addr = Unicode()
    # conversation = Reference(thread_addr, Conversation.thread_addr)

class Conversation(object):
    """Conversation class for sqlite database"""
    __storm_table__ = "conversation"
    thread_addr = Unicode(primary=True)
    subject = Unicode()
    datestr = Unicode()
    articles = ReferenceSet(thread_addr, StormArticle.thread_addr)
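
The relationship behind the ReferenceSet can be shown in plain sqlite3: instead of a list of msgids stored on the conversation, every article carries the thread_addr of its conversation and the members are found with a query. This is only a sketch; the column types, the trimmed-down column set and the sample rows are assumptions, not the actual ArchiverUI schema.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE conversation (
        thread_addr TEXT PRIMARY KEY,
        subject TEXT,
        datestr TEXT
    );
    CREATE TABLE article (
        msgid TEXT PRIMARY KEY,
        subject TEXT,
        date REAL,
        thread_addr TEXT REFERENCES conversation(thread_addr)
    );
""")

# Made-up sample data: one conversation with two articles.
conn.execute("INSERT INTO conversation VALUES ('t1', 'ArchiverUI', 'Jun 29 2011')")
conn.executemany('INSERT INTO article VALUES (?, ?, ?, ?)', [
    ('<2@x>', 'Re: ArchiverUI', 2.0, 't1'),
    ('<1@x>', 'ArchiverUI', 1.0, 't1'),
])


def articles_in(conn, thread_addr):
    """Return the msgids of one conversation, oldest first."""
    rows = conn.execute(
        'SELECT msgid FROM article WHERE thread_addr = ? ORDER BY date',
        (thread_addr,))
    return [row[0] for row in rows]


print(articles_in(conn, 't1'))
```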

Sqlite Database:

After finalizing (sort of) the schema of the tables, I needed a sample database to fill those tables. So, I added support (to the existing pipermail) for creating and filling a sqlite database from mbox files (only for the article table). This way, I got the db file that I needed to develop archiverUI with Pylons.
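
For readers who want to try something similar outside pipermail, here is a rough stand-alone sketch using only the standard library (Python 3). It is not the conversion code mentioned above; it fills a reduced article table from an mbox file, and the function name, the column subset and the sample messages are assumptions. Message bodies and non-ASCII headers are left out on purpose.

```python
import mailbox
import os
import sqlite3
import tempfile
from email.utils import parseaddr, parsedate_to_datetime


def mbox_to_sqlite(mbox_path, conn):
    """Copy basic headers of every message in an mbox into an article table."""
    conn.execute(
        'CREATE TABLE IF NOT EXISTS article ('
        ' msgid TEXT PRIMARY KEY, subject TEXT, author TEXT, email TEXT,'
        ' datestr TEXT, date REAL, in_reply_to TEXT)')
    box = mailbox.mbox(mbox_path)
    count = 0
    with conn:
        for msg in box:
            author, address = parseaddr(msg.get('From', ''))
            datestr = msg.get('Date', '')
            try:
                timestamp = parsedate_to_datetime(datestr).timestamp()
            except (TypeError, ValueError):
                timestamp = None  # missing or unparseable Date header
            conn.execute(
                'INSERT OR IGNORE INTO article VALUES (?, ?, ?, ?, ?, ?, ?)',
                (msg.get('Message-ID'), msg.get('Subject', ''), author,
                 address, datestr, timestamp, msg.get('In-Reply-To')))
            count += 1
    box.close()
    return count


# Two made-up messages in mbox format.
SAMPLE = (
    'From alice@example.com Mon Jun 27 10:00:00 2011\n'
    'From: Alice <alice@example.com>\n'
    'Subject: First\n'
    'Date: Mon, 27 Jun 2011 10:00:00 +0000\n'
    'Message-ID: <1@example.com>\n'
    '\nHello\n\n'
    'From bob@example.com Mon Jun 27 11:00:00 2011\n'
    'From: Bob <bob@example.com>\n'
    'Subject: Re: First\n'
    'Date: Mon, 27 Jun 2011 11:00:00 +0000\n'
    'Message-ID: <2@example.com>\n'
    'In-Reply-To: <1@example.com>\n'
    '\nHi Alice\n')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.mbox')
    with open(path, 'w') as handle:
        handle.write(SAMPLE)
    conn = sqlite3.connect(':memory:')
    count = mbox_to_sqlite(path, conn)

rows = conn.execute(
    'SELECT msgid, author, in_reply_to FROM article ORDER BY date').fetchall()
print(count, rows)
```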

Pylons + Storm:

Developed the very basic framework to generate conversation list and conversation page. Tested only for conversation page. LOTS of things to do. Code can be found at: http://code.launchpad.net/~dushyant37/+junk/ArchiverUI

Right now, there is no interaction with mailman/pipermail; all it requires is a sqlite file.

Unicode:

While parsing article bodies from the mbox file to store them in the database, I got stuck on lots of UnicodeErrors. It's sort of solved now, but I still need to get the concepts related to unicode and i18n straight.
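
A common way to avoid such errors is to decode the body with the charset the message declares and fall back to a few others when that fails. The sketch below (Python 3, standard library only) shows that idea; it is not the fix used in the archiver, and the function name and the fallback order are assumptions.

```python
import email


def body_as_text(msg, fallbacks=('utf-8', 'latin-1')):
    """Return the first text/plain part of `msg` as a unicode string."""
    part = msg
    if msg.is_multipart():
        for candidate in msg.walk():
            if candidate.get_content_type() == 'text/plain':
                part = candidate
                break
    payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
    if payload is None:
        return ''
    declared = part.get_content_charset()
    # Try the declared charset first, then the fallbacks in order.
    for charset in ([declared] if declared else []) + list(fallbacks):
        try:
            return payload.decode(charset)
        except (LookupError, UnicodeDecodeError):
            continue
    return payload.decode('ascii', 'replace')


# Made-up message whose header claims utf-8 but whose body is latin-1.
raw = (b'Content-Type: text/plain; charset="utf-8"\n'
       b'Content-Transfer-Encoding: 8bit\n'
       b'\n'
       b'caf\xe9\n')
text = body_as_text(email.message_from_bytes(raw))
print(repr(text))
```

Because latin-1 can decode any byte sequence, it works as a last resort, although the resulting characters may be wrong for messages in other encodings.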

IMMEDIATE TODO:

  • Finalize the storm object representation of conversation and article depending on ability to use List with sqlite