Storm Migration Guide¶
Important
This document has been migrated from our old wiki as is, and has not yet been revised. The content might be outdated, links and images could be broken. We are aware and will fix any issues as soon as possible.
This guide explains how certain SQLObject concepts map to equivalent Storm concepts. It expects a level of familiarity in how SQLObject works (or at least how it is used in Launchpad). It is not a full tutorial on how to use Storm either - see https://storm.canonical.com/Tutorial for that.
Differences¶
Now that we’ve landed the Storm code, Launchpad is running on top of Storm’s SQLObject compatibility layer. This is not the end of the story though, since we want to move to using the native Storm API. Due to the way the compatibility layer is structured it is possible to start using many of Storm’s native APIs right away, so this will be a gradual process rather than a single big change as with the first stage.
Connections¶
With SQLObject, each database class has a connection associated with it which is used for loading objects and performing queries. With Storm, the equivalent concept is a Store. Unlike SQLObject, stores are bound to instances rather than their classes. This means that a single class can be used to refer to objects in multiple databases (or to objects in the same database over different DB connections, as you might want to do in tests).
There are two main ways to access the main store. One is explicitly via
the IStoreSelector
utility:
#!python
from lp.services.database.interfaces import (
DEFAULT_FLAVOR,
IStoreSelector,
MAIN_STORE,
)
store = getUtility(IStoreSelector).get(MAIN_STORE, DEFAULT_FLAVOR)
Use the master flavor if you need to update the objects. Use the slave flavor to offload a search to a replica database and don’t mind the search being made on data a few seconds out of date. Use the default flavor if you don’t need to make changes, but need an up to date copy of the database (e.g. most views, as the object you are viewing might just have been created) - Launchpad will choose an appropriate flavor.
The other method is from an existing object:
#!python
from storm.store import Store
store = Store.of(some_object)
The second form is often more convenient, and is preferred if you don’t need to make updates and want them to play nicely with objects from an unknown store (e.g. passed in via your method parameters).
Utility methods and Stores¶
If you are writing a utility method like MailingListSet.get, use the default store. Utility methods can’t know whether the caller will be writing to objects it retrieves. But the default choice makes pretty good guesses about whether your operation needs the master store. It uses the master store:
If you are in a non-web context, like a batch job
If you are doing a POST (which means your overall operation may write)
If you are doing a GET, but have recently written (which means the slaves may not have your latest changes).
So the only times you’ll run into trouble are if:
a GET operation writes to the database
a GET operation relies on data that was written to the database by another GET
a GET operation relies on data that was written to the database by another browser instance.
We plan to address these issues better once we’re using Python 2.5 and its support for with statements / context management.
Adding Objects¶
/!\ Note: this section applies to classes that are not defined using the compatibility layer. Classes using the compatibility layer continue to provide the SQLObject behaviour.
With SQLObject, a default constructor is added to database classes that inserts a new row in the database that takes column values as keyword arguments. Storm does not provide a default constructor, so classes will need to add one.
Furthermore, Storm does not add the object to the database on instantiation: that must be done separately. There are two ways that an object can be added to a store. It can be added directly:
#!python
store.add(object)
Or you can link it to an existing object, which will add it to that object’s store:
#!python
owner = getUtility(IPersonSet).getByEmail('[email protected]')
product = Product(owner=owner) # product added to owner's store
Removing Objects¶
Objects can be removed from the database using the `Store.remove` method. To remove an object from its store, you can use:
#!python
from storm.store import Store
Store.of(some_object).remove(some_object)
Getting Objects by ID¶
The equivalent of SQLObject’s `Class.get()` method is `Store.get`. It takes the class and the primary key of the object as arguments:
#!python
store = getUtility(IZStorm).get('main')
person = store.get(Person, 42)
Querying Objects¶
The equivalent of SQLObject’s select
, selectBy
, selectOne
,
selectOneBy
, selectFirst
and selectFirstBy
methods is
Store.find()
. It acts quite similar to the equivalent SQLObject
methods, and the following are equivalent:
#!python
result = store.find(Person, displayname='Some guy')
result = store.find(Person, Person.displayname == 'Some guy')
result = store.find(Person, "person.displayname = 'Some guy'")
Note that the “.q.” bit is not required in the second example. The
first two versions are preferred to direct SQL since they allow Storm to
determine which tables are being used in the query automatically. As
with SQLObject, no query is issued when executing find()
: that is
delayed until you try to access the result set.
The behaviour of selectOne
and selectFirst
are covered by the
one
and first
methods on the result set. You can chain them with
the find
call if it is appropriate:
#!python
# Raises NotOneError if there is more than one item in the result set
person = store.find(Person, displayname='Some guy').one()
# Raises UnorderedError if the result set has no order
person = store.find(Person, displayname='Some guy').first()
# Like first() but doesn't complain about unordered result sets
person = store.find(Person, displayname='Some guy').any()
Result sets can be indexed, sliced and iterated over as with SQLObject. An ordering can be applied to the result set with the `order_by` method:
#!python
result.order_by(Person.name, Person.id)
Unlike SQLObject, the ordering is applied to the result set rather than
creating another one. The method does return the result set though, to
make it possible to chain the calls when constructing a result set.
Similar to SQLObject, a table can specify the default ordering for
results with the __storm_order__
class attribute.
See the storm.store.ResultSet
doc strings and the Storm tutorial for
more details on what is possible.
Defining Tables¶
Some of the primary differences between SQLObject and Storm database class definitions are:
Subclass from
lp.services.database.stormbase.StormBase
instead oflp.services.database.sqlbase.SQLBase
. (Subclassingstorm.base.Storm
also works in most cases, butStormBase
adds astorm_invalidate
hook for cached properties.)Use the
__storm_table__
attribute to set the table name instead of_table
.The primary key must be defined explicitly. This will usually look like:
#!python
id = Int(primary=True)
The class should have a constructor if appropriate (some classes like
BugSubscription
may not need one). Note that the constructor should not usually add the object to a store – leave that for aFooSet.new()
method, or let it be inferred by a relation. Barry Warsaw: what if there is no ``FooSet`` or relation? See question below.Default result set ordering should be set using the
__storm_order__
property rather than_defaultOrder
.Use the column definition classes are found in `storm.properties`, and do not use the
Col
suffix. In general, they will follow Python’s type naming conventions rather than SQL’s (e.g. TimeDelta rather than Interval).There is no equivalent of
alternateID=True
. TheStore.find()
method provides equivalent functionality to thebyColumnName
methods generated by this argument.To specify that a column can not contain NULLs, use
allow_none=False
rather thannotNull=True
. Note that if NULLs are found in such columns,NoneError
will be raised.If no
default
is specified for a column, the database default will be used. Sodefault=DEFAULT
or similar can be removed.Be sure your table has a
PRIMARY KEY
constraint defined, otherwise yourid
column will not get set automatically and you will get anIntegrityError
from PostgreSQL.
Foreign Key References¶
The equivalent of SQLObject’s ForeignKey
class is Reference
. A
Storm Reference
property creates a relationship between a local
column and a remote column. Unlike ForeignKey
, it does not implicitly
create the FK column. So the following definitions are equivalent:
#!python
# SQLObject
owner = ForeignKey(foreignKey='Person', dbName='owner')
# Storm
ownerID = Int('owner')
owner = Reference(ownerID, 'Person.id')
The columns can be passed directly to Reference(), or can be passed as strings that are looked up on first use.
The Reference
class is also used to replace SQLObject’s
SingleJoin
class:
#!python
# SQLObject
import_job = SingleJoin('CodeImportJob', joinColumn='code_importID')
# Storm
import_job = Reference(id, 'CodeImportJob.code_importID', on_remote=True)
Reference Sets¶
The SQLMultipleJoin
and SQLRelatedJoin
classes are replaced by
Storm’s ReferenceSet
:
#!python
# SQLObject
subscriptions = SQLMultipleJoin('QuestionSubscription', joinColumn='question')
subscribers = SQLRelatedJoin('Person',
joinColumn='question', otherColumn='person',
intermediateTable='QuestionSubscription', orderBy='name')
# Storm
subscriptions = ReferenceSet(id, QuestionSubscription.questionID)
subscribers = ReferenceSet(id, QuestionSubscription.questionID,
QuestionSubscription.personID, Person.id,
order_by=Person.name)
While the SQLObject properties return plain result sets, the Storm
properties return BoundReferenceSet
objects. Some differences
include:
add(obj)
andremove(obj)
methods are provided for adding and removing objects from the set. These are roughly equivalent to the automaticaddFoo()
andremoveFoo()
methods that SQLObject generates. For reference sets that join through a third table, Storm will take care of inserting and deleting rows as needed.A
find()
method is provided for searching for objects within the reference set. This behaves a lot likeStore.find()
without the first argument.
Property Setters / Validators¶
SQLObject provided two ways of controlling how variables were set:
magic
_set_columnName()
methods.the validator argument on column definitions.
Storm does not support magic methods but does have validators (albeit in
a simpler form than SQLObject). A validator is a function that takes
(object, attr_name, new_value)
as arguments and returns the value
that should be set. This allows validation to be performed on the new
value (by raising an exception on bad values), and transformation of the
value if appropriate (by returning something other than new_value
).
A validator can be set for a column with the validator
argument in
the column definition.
You may notice some uses of storm_validator
in code using the
compatibility layer. As the compatibility layer does not implement the
either of the SQLObject validation APIs, this was done to allow use of
Storm validators without completely rewriting the definitions.
Prejoins¶
Storm’s equivalent of prejoins is tuple finds. To select all products
that are part of launchpad-project
and their owners, we can do:
#!python
launchpad_project = store.find(Project, name='launchpad-project')
result = store.find((Product, Person),
Product.project == launchpad_project,
Product.owner == Person.id)
Iterating over this result will give us (product, person) tuples. The above case performs an inner join, so is not appropriate for cases where the foreign key linking the tables can be NULL. In those cases, a slightly different syntax is needed:
#!python
from storm.expr import LeftJoin
result = store.using(LeftJoin(Product, Project, Product.project == Project.id)).find(
(Product, Project))
This result set will return (product, project) tuples, with project set to None where appropriate.
If you need to select a table multiple times, it is necessary to alias it. For example:
#!python
from storm.info import ClassAlias
Driver = ClassAlias(Person, 'driver')
result = store.using(LeftJoin(Product, Driver, Product.driverID == Driver.id), Person).find(
(Product, Person, Driver), Product.owner == Person.id)
This result set will return (product, owner, driver) tuples.
Direct SQL Queries¶
To perform direct SQL queries, we previously used the cursor()
function from lp.services.database.sqlbase
to get a cursor on the
connection being used by SQLObject. These uses should be converted to
use Store.execute()
, which will make sure pending changes have been
flushed to the database first in order to stay consistent.
This method returns a result object with get_one
and get_all
methods that act like a cursor’s fetchone
and fetchall
methods.
It also supports iteration.
#!python
result = store.execute("SELECT name FROM person ORDER BY name")
names = result.get_all()
Migration Plan¶
A good order to migrate code is:
Convert column properties to use the Storm syntax. This should be a no-op change, and not affect external code.
Convert
ForeignKey()
definitions to an appropriate pair ofInt()
andReference()
definitions.Convert
sync()
,syncUpdate()
,destroySelf()
, etc calls to Storm equivalents.Convert uses of
Class.select*()
to usefind()
. Note that you lose prejoins support here, so use tuple finds as appropriate. Change queries to use Storm expressions rather than sqlbuilder expressions.Convert
SQLMultipleJoin
andSQLRelatedJoin
toReferenceSet()
. As this changes the API of the class a bit, it will probably require changes external to the class.Change the class to derive from
lp.services.database.stormbase.StormBase
instead ofSQLBase
.
This list is roughly ordered based on the locality of changes and based on dependencies between changes.
For new code, consider using native Storm API from the start, rather than continuing to use the compatibility layer.
Tips on Converting Tests¶
From “Tips in converting tests to Storm”, May 30, 2008,
Below are some tips on writing Storm code for Launchpad. I won't go
too deep into the Storm API, and instead concentrate on some of the
differences between SQLObject and Storm's SQLObject compatibility
layer.
1. Storm is stricter with respect to the types it accepts in various
situations. Most of the cases where this has caused problems in tests
have indicated problems in Launchpad or its tests. Below are a number
of the common problems I've encountered:
(a) The SQLObject EnumCol accepts values other than enumeration values
on the Python side. With the upgrade to Storm, things are a bit
stricter, and the correct enumeration values need to be passed in.
I found a few cases where some code had an enumeration value and then
passed in item.value. In a few tests, the numeric constants were
being used.
This affects both creating/updating objects and building queries.
(b) When assigning to a foreign key attribute, you need to assign an
object of the right type. It seems that SQLObject would accept any
class for such assignments. And provided a row from the correct table
existed with the same ID existed you'd get no complaints from the
database. There were a few tests that had bugs like this.
(c) SQLObject lets you assign a result set to a foreign key reference
when creating or updating an object, while Storm does not.
This might sound like a missing feature, except for the fact that
SQLObject seems to treat result sets as NULLs when generating SQL. So
any new errors caused by this are genuine errors.
2. Storm flushes changes to the database implicitly before various
operations. In general, this is good since you don't need to remember
to flush changes before running select(). That said, we have a number
of cases where we have code that relies on changes not being flushed
to disk. Some examples include:
* the code to warn about assigning bugs to non developers transitioned
to the new assignee and then checked to see if the new assignee had
any bugs assigned to them. As Storm flushed the change in assignee,
it always looked like the user had assigned bugs. Switching the order
of these two operations fixed the bugs.
* Some of the PPA tests would set a PPA to private and then set the
buildd password. Database constraints require that private PPAs have
a password, so in some tests where a flush occurred between the two
operations an IntegrityError was raised. Reordering the two
statements fixed the problem.
Of course, there are cases where it is useful to have implicit flushes
turned off. There is an API to block implicit flushes, and I've made
use of it for our security policy (which could otherwise introduce
flushes to almost any attribute access) and most event subscribers.
I've done this in most cases with the
lp.services.database.sqlbase.block_implicit_flushes function decorator.
3. Storm flushes some changes later than SQLObject. Namely inserts or
deletes to the database. Furthermore, the order that objects are
added in a single flush is not defined. This exhibits itself in two
ways:
* If inserting a row would cause an integrity error, that error will
occur at flush time rather than object construction time. Tests for
such failures need to explicitly flush the object.
* Some tests would create a number of objects in a single flush group
and expect them to have IDs in the same order. Such tests need to
either take this into account or add explicit flushes to preserve the
ordering.
* In cases where objects are created that reference each other in a
loop, a manual flush will be needed before closing the loop.
Otherwise Storm won't know what order to insert them in.
4. sqlbuilder expressions do not yield SQL from str(). Storm uses
quite a different method to convert sql expression objects to SQL
statements, and this is a result of that. There were a few cases of
code that took a builder expression and substituted it into a string
to form a larger query. I've generally fixed cases like this by
converting the string expression to builder objects.
Constant expressions (e.g. UTC_NOW) will still work with sqlvalues(),
but not when substituted directly.
James.
Questions¶
12-Aug-2008
Some of our ForeignKey columns had
notNull=True
but Storm’s Reference class does not acceptallow_none=False
keyword argument.Put the
allow_none=False
on theInt
rather than on theReference
.
How to actually convert a UtcDateTimeCol to a DateTime? For now, I’m using a DateTime with
tzinfo=pytz.timezone('UTC')
keyword argument. Also, doesdefault=UTC_NOW
still work?Use
default_factory=datetime.utcnow
instead.
Can I still use EnumCol, or is there a better way to hook up with our DBEnums?
Try
lp.services.database.enumcol.DBEnum
.
03-Oct-2008
I’m still confused about the right way to add an object to a store. If I’m using native Storm APIs (as all new code should, right?) should I add a
Store.add()
call my database object’s__init__()
? That seems to be the most straightforward translation of the SQLObject compatibility layer. And if the answer is “yes”, then how do I get the Store to use? I could useStore.of(someobj).add(self)
butsomeobj
might not be in the right store. I could use thegetUtility()
trick, but it seems wrong that a database module should be importing an interface fromwebapp
.