[ale] Two offices, one data pool

Michael B. Trausch mike at trausch.us
Thu Feb 17 10:13:43 EST 2011


On Thu, 2011-02-17 at 07:40 -0500, Jim Kinney wrote:
> put the data source where both offices have crappy access speed.

Well, I had thought about putting something together, but I don't know
how well it'd do.

What I've been thinking about is a way to reuse the existing Samba
infrastructure so that locks and such are shared between both offices.
For example, if all of the data were stored on Amazon's S3 service, and
a VFS plugin were written for Samba to expose the data on S3 as a shared
filesystem, it might be possible to keep the requirement that two people
cannot modify the same document at once.

For the moment, everything's in one office.  So if someone opens a
spreadsheet, and then someone else opens the same spreadsheet, the
second user gets the spreadsheet in read-only mode, because the first
person has it open and protected against writes by other people.
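
A rough sketch of that first-opener-wins behaviour, if the lock table
lived somewhere both offices could reach.  Everything here is
hypothetical: LockTable, acquire and release are made-up names, and the
in-memory dict only imitates a shared store.  A real one would need an
atomic test-and-set on whatever backs it.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LockTable:
    """Hypothetical stand-in for a lock store both offices can reach."""
    holders: Dict[str, str] = field(default_factory=dict)

    def acquire(self, path: str, user: str) -> str:
        """Return "rw" for the first opener and "ro" for anyone after."""
        # setdefault records the user only if nobody holds the path yet.
        holder = self.holders.setdefault(path, user)
        return "rw" if holder == user else "ro"

    def release(self, path: str, user: str) -> None:
        """Drop the write lock, but only if this user actually holds it."""
        if self.holders.get(path) == user:
            del self.holders[path]


locks = LockTable()
first = locks.acquire("budget.xls", "alice@office-a")   # first opener
second = locks.acquire("budget.xls", "bob@office-b")    # opened while held
locks.release("budget.xls", "alice@office-a")
third = locks.acquire("budget.xls", "bob@office-b")     # after release
```

The second opener falls back to read-only instead of failing, which
matches what users already see today in the single office.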

The other bonus is that if it's done using Samba's VFS support, it looks
like it would be possible to support the whole Windows way of doing
things with relative ease.  Security descriptors and all that jazz could
be stored as metadata attached to the files, as could things like file
locks.
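
To make the metadata idea concrete, here is a small sketch that keeps
the security descriptor and the lock holder as string key/value pairs
next to the file data.  StoredFile, set_windows_attrs and the key names
are invented for illustration, and the SDDL string is only a sample
value, not something taken from a real share.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StoredFile:
    """Hypothetical object record: file bytes plus string metadata."""
    data: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


def set_windows_attrs(f: StoredFile, sddl: str, lock_holder: str = "") -> None:
    """Attach a security descriptor and, optionally, a lock holder."""
    f.metadata["security-descriptor"] = sddl
    if lock_holder:
        f.metadata["lock-holder"] = lock_holder
    else:
        # An empty holder means the file is unlocked; remove any old entry.
        f.metadata.pop("lock-holder", None)


doc = StoredFile(b"spreadsheet bytes")
set_windows_attrs(doc, "O:BAG:BAD:(A;;FA;;;WD)", lock_holder="alice")
locked_meta = dict(doc.metadata)
set_windows_attrs(doc, "O:BAG:BAD:(A;;FA;;;WD)")  # unlock, keep descriptor
```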

The other problem that would need to be solved is caching; documents
that are frequently accessed should be cached locally so that
round-trips over the Internet can be avoided where possible.  A method
for invalidating the cache would also be required; if document X is in
the local cache in office A, and someone in office B modifies the
document, there should be some way for the server in office A to become
aware of that fact and invalidate its copy (either by fetching the
updated copy and inserting that into its cache, or by just dropping it
entirely).
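
The drop-it-entirely variant of that invalidation can be sketched as
below.  RemoteStore and OfficeCache are hypothetical names, the store is
an in-memory stand-in for the shared pool, and how office A actually
hears about the change (polling, a notification, whatever) is left out
and simply assumed to end in a call to invalidate().

```python
from typing import Dict, Tuple


class RemoteStore:
    """Hypothetical shared pool; counts fetches to show cache hits."""

    def __init__(self) -> None:
        self._docs: Dict[str, Tuple[int, bytes]] = {}
        self.fetches = 0

    def put(self, name: str, data: bytes) -> int:
        """Store new contents and bump the document's version number."""
        version = self._docs.get(name, (0, b""))[0] + 1
        self._docs[name] = (version, data)
        return version

    def get(self, name: str) -> Tuple[int, bytes]:
        """Simulate the slow round-trip over the Internet."""
        self.fetches += 1
        return self._docs[name]


class OfficeCache:
    """Per-office cache in front of the shared pool."""

    def __init__(self, store: RemoteStore) -> None:
        self.store = store
        self._cache: Dict[str, Tuple[int, bytes]] = {}

    def read(self, name: str) -> bytes:
        """Serve from the local cache, fetching only on a miss."""
        if name not in self._cache:
            self._cache[name] = self.store.get(name)
        return self._cache[name][1]

    def invalidate(self, name: str) -> None:
        """Drop the local copy; the next read fetches it again."""
        self._cache.pop(name, None)


store = RemoteStore()
store.put("x.doc", b"v1")
office_a = OfficeCache(store)
before = office_a.read("x.doc")
office_a.read("x.doc")            # second read is a cache hit
hits_after_two_reads = store.fetches
store.put("x.doc", b"v2")         # someone in office B saves a change
office_a.invalidate("x.doc")      # office A is told about it
after = office_a.read("x.doc")
```

Fetching the updated copy eagerly instead would just mean calling
read() right after invalidate().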

I'd (briefly) considered something like DRBD, but I just can't see that
working well enough without a leased line between the offices.  I've no
interest in attempting to resolve issues like the offices falling out of
sync with each other.  It has to be stupid-easy for people to use, and
stupid-easy for me to manage.  That's really the goal.

> Given the desktop clients they will be using most likely M$office. So
> "shared data" means word files. To prevent clashes you need a document
> management tool. KnowledgeTree is one. Alfresco is another.

I will look into both of those.  Do those types of systems have some
sort of method by which Samba could expose their repositories as shared
filesystems?

> You _could_ set up a subversion repo and scripts to merge xml
> docs .... 

Eeeeeeee, no.  bzr, maybe.  :-P

But they're not all XML documents, anyway.  There are an awful lot of
legacy binary-blob document types, and even the newer XML-based ones are
wrapped up in binary files such that automatic merging would become
problematic, I think.  One would need some method for deeply inspecting
each document and working out how to diff it, and all of its content,
against the previous version.  Given the number of documents in Office
2003 and earlier formats that are present, that would be challenging at
best.

	--- Mike