Performance and Profiling:
- Template caching:
return render('base/csubjects.html', cache_type='memory', cache_expire=60)
This keeps the rendered template output in the in-memory cache for 60 seconds, which produced a large performance gain.
The cache_expire time can also be set higher than 60 seconds in our case.
- Turn off debug in the .ini file and don't use --reload when starting the server
- Turn off Mako’s template checking:
Another important flag on TemplateLookup is filesystem_checks. This defaults to True, and says that each time a template is returned by the get_template() method, the revision time of the original template file is checked against the last time the template was loaded, and, if the file is newer, Mako will reload its contents and recompile the template. On a production system, setting filesystem_checks to False can afford a small to moderate performance increase (depending on the type of filesystem used).
Profiling using repoze.profile:
- Integrated repoze.profile with ArchiverUI to see which portion of the code is taking the longest. Right now, it seems that hide_quoting component can be further optimized.
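repoze.profile works as WSGI middleware wrapped around the application. A minimal sketch of the same idea using only the stdlib cProfile (the middleware class and the demo app below are illustrative, not the actual ArchiverUI code):

```python
import cProfile
import io
import pstats


class ProfilingMiddleware:
    """Hypothetical WSGI middleware that profiles every request,
    similar in spirit to repoze.profile."""

    def __init__(self, app):
        self.app = app
        self.profiler = cProfile.Profile()

    def __call__(self, environ, start_response):
        # Accumulate profile data across requests.
        return self.profiler.runcall(self.app, environ, start_response)

    def stats(self, sort='cumulative'):
        """Return the accumulated profile statistics as text."""
        out = io.StringIO()
        pstats.Stats(self.profiler, stream=out).sort_stats(sort).print_stats(10)
        return out.getvalue()


def demo_app(environ, start_response):
    """Stand-in application for the demo."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


mw = ProfilingMiddleware(demo_app)
body = mw({'PATH_INFO': '/'}, lambda status, headers: None)
```

The stats output is what points at hotspots like the hide_quoting component.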
- Functional Testing with Nose:
Added some tests to check basic responses of different pages of ArchiverUI. For example, make sure the response is 404 if any invalid url is passed and response contains correct page for every valid url.
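A sketch of the kind of nose-style functional tests meant here, using a stub WSGI app in place of ArchiverUI (Pylons actually provides paste.fixture's TestApp for this; the stub and URLs below are only illustrative):

```python
def stub_app(environ, start_response):
    """Stub standing in for the ArchiverUI WSGI app (illustrative only)."""
    valid_urls = {'/': b'conversation list',
                  '/conversation/1': b'conversation page'}
    path = environ.get('PATH_INFO', '/')
    if path in valid_urls:
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [valid_urls[path]]
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'not found']


def get(app, path):
    """Tiny test client: return (status, body) for a GET request."""
    captured = {}

    def start_response(status, headers):
        captured['status'] = status

    body = b''.join(app({'PATH_INFO': path}, start_response))
    return captured['status'], body


def test_invalid_url_returns_404():
    status, _ = get(stub_app, '/no/such/page')
    assert status.startswith('404')


def test_valid_url_returns_correct_page():
    status, body = get(stub_app, '/conversation/1')
    assert status.startswith('200') and b'conversation page' in body
```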
- Made a few modifications to make the code more readable and started adding docstrings
class Mlist(object):
    """Class to represent an mlist table in a sqlite database."""
    __storm_table__ = "mlist"
    id = Int(primary=True)
    list_name = Unicode()
    # Path to the archives database corresponding to list_name
    db_path = Unicode()
- whenever a new list is created, update mlist table
- whenever a message is archived, update the database file corresponding to that mlist.
- Generalize for more than one mailing list
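The two update hooks above can be sketched with raw sqlite3 against the mlist schema (the function names are mine; the real code goes through Storm):

```python
import sqlite3


def add_mlist(conn, list_name, db_path):
    """Hypothetical hook: register a newly created list in the mlist table."""
    conn.execute("INSERT INTO mlist (list_name, db_path) VALUES (?, ?)",
                 (list_name, db_path))
    conn.commit()


def archive_db_for(conn, list_name):
    """Look up which per-list archive database a newly archived
    message belongs in."""
    row = conn.execute("SELECT db_path FROM mlist WHERE list_name = ?",
                       (list_name,)).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE mlist (id INTEGER PRIMARY KEY, "
             "list_name TEXT, db_path TEXT)")
add_mlist(conn, 'mailman-developers', '/var/archives/mailman-developers.db')
```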
I have added support for searching in ArchiverUI. Most of the logic/code written by Priya as part of last year's GSoC has been reused, with some changes.
- Indexing: For the first time, indexing is done for all the messages stored in sqlite database file. After that, every time index_archives() is called, only new messages (that are added after the last call to index_archives() ) are indexed.
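The incremental part of index_archives() can be sketched with plain sqlite3: remember a high-water mark and only fetch rows added since the previous run (Whoosh does the actual indexing; the checkpoint scheme below is an illustration, assuming no deletions so rowids keep growing):

```python
import sqlite3


def new_messages_since(conn, last_rowid):
    """Return (rows, new_last_rowid): only messages added after the
    previous call, identified by sqlite's growing rowid (assumes no
    deletions, which can recycle rowids)."""
    rows = conn.execute(
        "SELECT rowid, msgid, subject, body FROM article WHERE rowid > ? "
        "ORDER BY rowid", (last_rowid,)).fetchall()
    new_last = rows[-1][0] if rows else last_rowid
    return rows, new_last


conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE article (msgid TEXT PRIMARY KEY, "
             "subject TEXT, body TEXT)")
conn.execute("INSERT INTO article VALUES ('<1@x>', 'hello', 'first message')")

rows, mark = new_messages_since(conn, 0)         # first run: everything
conn.execute("INSERT INTO article VALUES ('<2@x>', 're: hello', 'a reply')")
new_rows, mark = new_messages_since(conn, mark)  # later run: only the reply
```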
- Index Schema:
fields.Schema(msgid=fields.ID(stored=True, unique=True),
              author=fields.TEXT(), subject=fields.TEXT(), body=fields.TEXT())
Here, we only need to store msgid (stored=True), since we can query the database with the msgid to fetch the other fields. But this may add overhead, since we then have to query the database to show metadata for the search results. For now let's leave it in this state; if it turns out to be the reason for poor performance, I'll store author, subject and body as well.
As Barry had suggested, I have started looking at writing tests and documentation. I have been following this excellent guide for testing and documenting Pylons.
I have finished the basic implementation of archiver UI using pylons framework and Storm ORM. Though it still requires some minor fixes, it covers:
1. conversation-list view with a quick view of conversation messages (using Ajax)
2. conversation page with indentation and a quote-hiding feature.
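The quote-hiding feature boils down to detecting quoted runs in a message body; a minimal sketch (the real hide_quoting component in ArchiverUI is more involved):

```python
def split_quoted(body):
    """Split a message body into (kind, lines) runs, where kind is
    'text' or 'quote'; quoted lines are those starting with '>'.
    The UI can then collapse the 'quote' runs behind a toggle."""
    runs = []
    for line in body.splitlines():
        kind = 'quote' if line.lstrip().startswith('>') else 'text'
        if runs and runs[-1][0] == kind:
            runs[-1][1].append(line)   # extend the current run
        else:
            runs.append((kind, [line]))  # start a new run
    return runs


body = "Thanks!\n> earlier question\n> more context\nHere is my answer."
```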
Code can be found at: http://code.launchpad.net/~dushyant37/+junk/ArchiverUI
Egg package: http://bazaar.launchpad.net/~dushyant37/+junk/ArchiverUI/files/head:/ArchiverUI/dist/
Other info: README.txt
The ArchiverUI package just requires a sqlite database file. But currently, the test database file I am using was not generated from any real list archives (mbox file), so it is not well suited for demo purposes.
Initially, I wasn't sure whether I should focus on generating a database for demo purposes. But then I realised it would be good to get feedback from the Mailman community, so I started working on this and am now about to finish it.
Overall, generating pages dynamically seems to be a better way to view archives. I would like to discuss some other issues before proceeding further.
1. Performance: We need to look at the performance of the new archiver in order to improve it as well as to compare this approach of generating pages on the fly with the static one. Is there any specific way to go about it?
After that, performance gain can be achieved by proper use of caching which is offered by pylons and storm/sqlite.
2. Interaction with mailman: Right now, the interaction with mailman (archiver part) is through a sqlite database file. So, we just need to update the relevant database files on archiving a message through mailman.
If we want to further separate out the archiver from mailman, we can use the methods used by mhonarc and mailarchive interfaces.
3. Search: I also plan to integrate search functionality(work done by Priya) for archives into this pylons project.
We basically need two tables 1) Article 2) Conversation. As Andrew had already figured out the StormArticle object, I added the conversation object as well:
class Conversation(object):
    __storm_table__ = "conversation"
    con_id = Unicode(primary=True)  # or, thread_addr
    msgids = List()  # List of msgids which are part of this conversation
But I have been unable to use this List property with sqlite. I also realised that we should keep some more information in the conversation object in order to avoid multiple queries when building the conversation-list page.
So, I just went ahead and assumed these article and conversation objects in order to get started with pylons.
class StormArticle(object):
    """Class to represent an Article in a sqlite database rather than
    using pickles."""
    __storm_table__ = "article"
    archive = Unicode()
    sequence = Unicode()
    subject = Unicode()
    datestr = Unicode()
    date = Float()
    # headers = List()
    author = Unicode()
    email = Unicode()
    msgid = Unicode(primary=True)
    in_reply_to = Unicode()
    # references = List()
    body = Unicode()
    thread_addr = Unicode()
    # conversation = Reference(thread_addr, Conversation.thread_addr)
class Conversation(object):
    """Conversation class for sqlite database."""
    __storm_table__ = "conversation"
    thread_addr = Unicode(primary=True)
    subject = Unicode()
    datestr = Unicode()
    articles = ReferenceSet(thread_addr, StormArticle.thread_addr)
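At the SQL level, the ReferenceSet amounts to articles joined to their conversation by thread_addr, so the conversation-list page is a single grouped query. A plain sqlite3 sketch with an illustrative two-message thread:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE conversation (thread_addr TEXT PRIMARY KEY,
                               subject TEXT, datestr TEXT);
    CREATE TABLE article (msgid TEXT PRIMARY KEY, subject TEXT,
                          date REAL, thread_addr TEXT);
    INSERT INTO conversation VALUES ('t1', 'hello', '2010-06-01');
    INSERT INTO article VALUES ('<1@x>', 'hello', 1.0, 't1');
    INSERT INTO article VALUES ('<2@x>', 're: hello', 2.0, 't1');
""")

# One row per conversation with its message count, newest thread first --
# roughly what Conversation.articles gives us through Storm.
rows = conn.execute("""
    SELECT c.thread_addr, c.subject, COUNT(a.msgid)
    FROM conversation c JOIN article a ON a.thread_addr = c.thread_addr
    GROUP BY c.thread_addr ORDER BY MAX(a.date) DESC
""").fetchall()
```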
After finalizing (sort of) the schema of the tables, I needed a sample database to fill them. So I added support (to the existing pipermail) to create and fill a sqlite database from mbox files (only for the article table). This way, I got the db file I needed to develop ArchiverUI with Pylons.
Pylons + Storm:
Developed the very basic framework to generate conversation list and conversation page. Tested only for conversation page. LOTS of things to do. Code can be found at: http://code.launchpad.net/~dushyant37/+junk/ArchiverUI
Right now, there is no interaction with mailman/pipermail; all it requires is a sqlite file.
While parsing article bodies from the mbox file to store them in the database, I ran into lots of UnicodeErrors. It's sort of solved now, but all the concepts related to Unicode and i18n need to be cleared up.
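Most of these UnicodeErrors come from message bodies whose charset header is missing, unknown, or simply wrong. A defensive decode helper of the kind that solves this (a sketch, not the actual code in the branch):

```python
def to_unicode(raw, charset=None):
    """Decode raw message bytes to text, falling back gracefully when
    the declared charset is missing, unknown, or wrong."""
    if isinstance(raw, str):
        return raw
    # Try the declared charset first, then common fallbacks.
    for encoding in filter(None, (charset, 'utf-8', 'latin-1')):
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue
    # latin-1 never fails, but keep a last resort anyway.
    return raw.decode('utf-8', errors='replace')
```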
- Finalize the Storm object representation of conversation and article, depending on the ability to use List with sqlite
- Finished the basic implementation of the modified pipermail
- This branch can be found at: http://bazaar.launchpad.net/~dushyant37/mailman/mm3-archive/
- The main task left is to integrate work done by Andrew (converting Pipermail to use Storm/SQLite instead of pickles).
- Other tasks include completion of some unfinished work (I’ll have to discuss these issues with mentors first), changes in UI, testing/debugging. For some time, my main priority would be to complete dynamic generation of pages.
- Decided to use the Pylons framework. Pylons is completely new to me (though I am familiar with Django).
- Our requirement of using the Storm ORM alone dictated this decision.
Andrew is working on converting Pipermail to use Storm/SQLite instead of pickles to save data, so the framework needs to support the Storm ORM. Django has its own ORM, and Storm cannot be used with it natively (though Storm is said to have Django support, I could not find anything further to try/test Storm with Django).
Pylons doesn’t provide any ORM and I have tested a small application using Storm.
- Implemented a simple page using Storm to get started with pylons and test Storm integration.
- Started writing code to view conversations index.