scribus

Create Reports with Scribus & Drupal (Part 2)

Posted from: 
St.Gallen

In the previous examples we used the integrated Python scripting extension of Scribus to create reports. We accessed a MySQL database directly, but that situation is just not normally given on the Internet. More often, the database is hidden behind a CMS of some sort. While my experiences come from using Drupal, this should not depend on any specific CMS.

Now, to get the data from the database completely formatted into Scribus we either have to produce a complete SLA file or provide an interim data file that is then again sourced with a Python script in Scribus. To make it easiest for users, simply downloading an SLA file and opening it would be the preferred solution.

The ugly hack variant

Thankfully, Scribus files are a documented XML format (optionally gzipped). However, creating files from scratch is not really an option because invalid files fail silently and validity is often not just a question of being well-formed. With 1.3.6 this seems to have gotten a lot easier since Scribus now pops up a warning with a parse error and a line and column number. (If you want to test it, just insert a plain ampersand into ITEXT's CH.)

Even with that it might no be obvious why a text frame would necessitate all of a dozen parameters to display thus creating files from scratch is not really an option. So, since one probably does not have the time to study that XML format we'll just cut and paste. This works especially well in the first example where we have a single page and simply want to fill in values.

Coversheet

What you intend to cut up and modify are pageobjects. If you open up your SLA file you'll see that a pageobject often comes with 10-30 parameters. The only one shown here is ANNAME, since it's the object's name which you can set in the properties panel.

I recommend setting it within Scribus so you can easily find your object in the SLA file. If you leave the default, the parameter will be empty and the Text4 or Text5 that Scribus reports is just incremented based on the objects present.

<PAGEOBJECT  ANNAME="LoremA">
<ITEXT FONT="DejaVu Sans" FONTSIZE="20" CH="Lorem Ipsum"/>
<para/>
<ITEXT FONT="DejaVu Sans" FONTSIZE="12" CH="Dolor sit amet"/>
<PageItemAttributes/>
</PAGEOBJECT>

From this structure you should be able to splice your XML file together in such a way that you replace the CH parameter with your variable and you should get a valid SLA file that shows your changes.

However, any special characters in your string are likely to cause problems and have to be properly escaped, so run htmlspecialchars() on it first. Furthermore, if you have HTML rather than plain text, you have to strip out those tags, too. With Drupal, check_markup() and check_plain() are a great place to do the heavy lifting.

If you want to preserve the line breaks that <p></p> allow (and probably <br />, too, in many cases), I would recommend to split the string into several substrings and then later reconnect them. Unless you split your string into several strings in those occasions you will likely get confused where to apply htmlspecialchars() and where to substitute and escape. Afterwards, you can splice them together with:

"/>
  <para/>
<ITEXT CH="

Barcode labels

In this example it becomes patently clear that working with PHP's XML DOM would be the smarter option. I hope to be able to follow this post up with a “Part 2b” sometime in the near future that does exactly that but I'm not betting on it.

We begin by taking the example SLA we have from previous attempts. If you start from scratch, style the document as you like and consider having more than one page to get a feel for page structure.

First, the parameter ANZPAGES (sounds like Denglish to me) of the DOCUMENT element is simply the number of pages. For my example with 30 labels per page I got the number from the views module with:

$pagenums=ceil(count($themed_rows)/30);

Next, after the MASTERPAGE elements you have to define all pages. If you have two pages to compare you should be able to figure out that PAGEYPOS and NUM are the two parameters to increment in PAGE. I don't pretend to even remotely understand that neverending canavs, in my case it just turned out that for letter pages having an initial offset of 20 and then incrementing by 792 per page ( PAGEHEIGHT for letter paper) plus adding 40 did the job.

The PAGEOBJECT elements were a bit frustrating, but at least you don't have to interleave PAGE elements or other fun stuff. First, OwnPage is of course a reference to the page this object belongs on, simply increment until you hit the next page.

The XPOS and YPOS parameters can be filled with the same methodology outlined in Part 1. The WIDTH and HEIGHT parameter are exactly what one expects. The tricky one is POCOOR, though. I have no idea why it is the way it is. I simply noticed that it alternated width and height in the following pattern in my files:

$po="0 0 0 0 $w 0 $w 0 $w 0 $w 0 $w $h $w $h $w $h $w $h 0 $h 0 $h 0 $h 0 $w 0 0 0 0";

Afterword

So, was that even a marginally sensible approach? Maybe not. For my use case I just know this worked as a drop-in replacement to the system in the first part. In the future I might attempt to use OpenDocument instead of SLA (as a very speculative third instance of this series) and in turn use OpenOffice Writer rather than Scribus for this.

I do think that complex workflows are still very much a weakness of many open source web-based collaboration tools and CMS. An example of that is OpenAtrium, you can have a private group, collaboratively author and edit documents, it's lovely to work with but to export the document(s) you are left with copy and paste (my feature request on that). Of course, just having export options will not fix my very specific examples here but I think that it could have simplified things a lot.

Create Reports with Scribus (Part 1)

Posted from: 
St.Gallen

Why?

Taking content from a web application and transferring it onto paper can be a difficult problem. At least if the user wants to modify things before they go to the printer, especially on a Linux desktop that wouldn't run Adobe's Creative Suite natively, even if you wanted to spend the money on it. To show the solutions I came up with I want to look at two specific use-cases:

  1. Filling out a standard template, such as a coversheet.
  2. Printing barcode labels on common sets of labels such as 5160.

In this first part I'm assuming that direct access to an SQL database is possible, this might be the case on a local intranet. The next post will focus on options within a common CMS. The situation now in principle looks as follows for both use-cases:

Data in MySQL → ??? → Standardized document ready for print.

Scribus includes a scripting plugin that makes it possible to automate tasks with Python. While an upcoming version promises much improved scripting support, this tutorial will deal with the possibilities in the current 1.3.5.x releases.

Common functionality

Before we start to mess around with the internals of the document we want, we first have to introduce a preamble for the database connection and similar things. Make sure that you have the relevant Python modules installed. For instance, python-mysqldb (as it is known under Debian/Ubuntu) is required for this example. A general outline might look as follows:

import scribus
import MySQLdb scribus.openDoc("yourdocument.sla")
db = MySQLdb.connect(host="dbhost" , user="youruser", passwd="yourpw", db="yourdb") 
cursor = db.cursor()

While those blocks above should just work, it's a lot nicer to get some feedback where the script failed, should one condition not be met. Feel free to use this preamble which tries each statement and pops up useful exceptions. Put it all in a def block and you can reuse it for all your Scribus scripts.

After that, you would only need to run your SQL query against the cursor. This will retrieve the result set you will use a little later on to customize your document:

cursor.execute (cmd)

A few other useful ones:

cursor.fetchone() // fetch one row, easy to iterate through
cursor.rowcount() // number of rows in result set
cursor.close() //self-explanatory like the next one
db.close()

Coversheet

If you have a limited number of fields and want to quickly automate an exisiting Scribus document all you need is the names of the objects (i.e. text frames) that you want to show. This one-liner does all the work for you:

scribus.setText("your text","object name")

If you don't know an object name, just select it in Scribus and under Properties it's the first field under X,Y,Z. In most cases though you will have to provide your script with some sort of input to use on the database, here valueDialog is what you need. So an example where you need to input a string (instead of an integer, where the str isn't needed), you would use:

choice=str(scribus.valueDialog('Popup Title','Please enter the item for which you want a coversheet'))

Warning: You need to make sure you properly escape your string/integer/etc for execute(), think %s, otherwise you will meet Little Bobby Tables sooner or later.

Barcode labels

When we look at something like a list of barcodes that we want to print on label sheets, normally we have no clue about the exact number of labels; it's definitely not 1, such as in the example above. Logically, just creating a complete Scribus document and filling in the text for placeholders won't work, we have to create the fields dynamically.

Before we can create the fields though, we first have to figure out the pages, let's assume there are 3 columns of labels with 10 rows each. The following will shows us how many pages we are actually working on. It gives us the number of pages, makes it one if it's less than 30 and adds another if a fraction of 30 is involved otherwise.

num=cursor.rowcount / 30
numr=cursor.rowcount % 30
if num == 0:
    num=1
elif numr != 0:
    num=num+1

The following code will start our iteration and add a consecutive page to the end of the document, each time it loops until we are on the final page. Notice that we not only have to create a page but also make it the active one by using gotoPage. The variables xnull and ynull are starting points on the page from which we can calculate the labels.

i=0
for f in range(1,num+1):
    if f > 1:
	scribus.newPage(-1,"Master")
    scribus.gotoPage(f)
    xbase=xnull
    ybase=ynull

They way our loops are set up the script will run through each row for three fields and then go to the next row, resetting the x value. While it makes sense to use a variable such as w to move the starting point of the label as necessary, using variables for the createText is often not worth the effort, especially with half a dozen fields, offset by a slight margin.

      for j in range(10):
	xbase=xnull
	for k in range(3):
	    row=cursor.fetchone()
	    if row:
		title=scribus.createText(xbase+1.34, ybase+.4, 1.15 , .45)
		i=i+1
		xbase=xbase+w
	ybase=ybase+h

Since we did not rely on already present frames, we will often need to set particular options for these fields as well, below is an example for our title frame. To actually produce barcodes on these labels, at least for trivial ones like Code39, we can just set a matching font here.

scribus.setFont("DejaVu Sans Mono", title)
scribus.setTextAlignment(1, title)
scribus.setFontSize(8, title)

Scribus can be rather slow when creating a few hundred frames (thousands work but you can get a coffee while it does) so the progress meter is a nice addition, simply add scribus.progressTotal(cursor.rowcount) before your main loop begins and invoke scribus.progressSet(i) each time you finished a label. I also used scribus.setRedraw(False) to improve performance but I'm not sure it still makes much of a difference.

In the next part we will take a look at what you can still use Scribus for if you rely on a common CMS and cannot just access MySQL to get data and thus also cannot use Scripter.

Syndicate content