Extraction of embedded inline images in InfoPath 2010

At work, we needed to extract images from rich text boxes in an InfoPath 2010 form. The image files are written to a temporary disk area and used to create an HTML preview of a document based on the XML content in the form. They are also uploaded to a SharePoint image library, separated from their source document. In InfoPath 2010, all images are stored inline on the img elements, base64-encoded in the xd:inline attribute.

Our forms require the InfoPath filler application, not web forms hosted by SharePoint. I don't think this approach will work in web forms. The code in the example FormCode.cs below needs full trust, since it writes to the hard disk. It's not very useful as it is written here; it is only meant to illustrate the extraction mechanism.
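The decoding step itself can also be tried outside InfoPath. What follows is not the FormCode.cs listing, just a minimal Python sketch of the same idea: find every xd:inline attribute in the form XML, base64-decode it and write the bytes to a folder. The namespace URI bound to xd:, the function names and the file naming scheme are my assumptions for illustration.

```python
import base64
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: the namespace URI that InfoPath binds to the xd: prefix.
XD_NS = "http://schemas.microsoft.com/office/infopath/2003"
INLINE = f"{{{XD_NS}}}inline"

# File signatures used to pick an extension for the decoded bytes.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF8", ".gif"),
]


def guess_extension(data):
    """Return a file extension based on the first bytes of the image."""
    for magic, extension in SIGNATURES:
        if data.startswith(magic):
            return extension
    return ".bin"


def extract_inline_images(form_xml, out_dir):
    """Decode every xd:inline attribute and write it as a file in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    root = ET.fromstring(form_xml)
    written = []
    for element in root.iter():
        encoded = element.get(INLINE)
        if encoded is None:
            continue  # not an element carrying an inline image
        data = base64.b64decode(encoded)
        # Hypothetical naming scheme: image1, image2, ... in document order.
        target = out_dir / f"image{len(written) + 1}{guess_extension(data)}"
        target.write_bytes(data)
        written.append(target)
    return written
```

Calling extract_inline_images(xml_text, some_folder) returns the list of files written, in document order. The sketch stores the bytes exactly as they were embedded and does not convert between image formats.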

If you upload the images to a SharePoint library like we do, you also have to add a src-attribute in the img-element, so that it points to its new address.
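Here is a rough Python sketch of that src step, assuming the images have already been uploaded and you have one URL per inline image in document order. The function name, the URL list and the xd: namespace URI are assumptions for illustration, not part of our form code.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI that InfoPath binds to the xd: prefix.
XD_NS = "http://schemas.microsoft.com/office/infopath/2003"
INLINE = f"{{{XD_NS}}}inline"


def set_image_sources(root, urls):
    """Set src on each element that has xd:inline, in document order.

    Returns the number of elements updated. The xd:inline attribute is
    left untouched; whether to remove it is a separate decision.
    """
    images = [el for el in root.iter() if el.get(INLINE) is not None]
    if len(images) != len(urls):
        raise ValueError("expected exactly one URL per inline image")
    for element, url in zip(images, urls):
        element.set("src", url)
    return len(images)
```

The function changes an already parsed tree and leaves saving to the caller. One thing to watch if you try this on a real form: ElementTree discards processing instructions that appear before the root element when it parses, so a real document would need a parser that preserves them.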

I don't include the form in this post, but it's simple: a rich text box with embedded images enabled, connected to one field, and a button with the id BTN_EXTRACT_IMAGES.

If images are of a more photographic nature than our images, then perhaps JPG is a better storage format than PNG.



Upgrading InfoPath forms and version on SharePoint

This is an off-topic log entry, but I guess most .NET developers are, or will be, familiar with SharePoint, whether they want to be or not. We are currently migrating a set of documents (InfoPath form XML files) edited on a SP 2003 server with a locally installed InfoPath 2007 form template to a set of documents in a SharePoint 2010 form library, where the template will be stored inside the form library.

We tried a number of different approaches, but they all failed when it came to telling the old documents that they should use the new form in the library. An entry on the InfoPath blog gave a lot of valuable information. The re-linking alternative didn't work as expected, so we turned to the PIFix alternative. That was a success.

I'm not sure why re-linking failed. My guess is that it failed because our old documents were using an installed schema, so they weren't pointing to any template URL. Perhaps SharePoint should support that scenario? The cause might also be unrelated, but I didn't spend more time tracking it down.

To migrate the forms, we did the following:
  1. Use InfoPath 2010 to open the 2007 .xsn file and convert it.
  2. Update the form's submit options so that when a user submits, the document ends up in the new library.
  3. Set the versioning information the way we want it (we use a yyyy.m.d.n scheme to keep it simple).
  4. Publish the form to the new forms library.
  5. Create a "dummy" test file with the "+ Add document" button in the forms library.
  6. Open that dummy file in a text editor such as Notepad and note the version and product version info, as well as the URL to the template inside the forms library (it ends with /Forms/template.xsn).
  7. Map a drive on our computer to the form library.
  8. Install the InfoPath 2003 SDK; you can find the link in the blog article referenced above.
  9. Open a command prompt and go to the forms library.
  10. Run the PIFix tool from the InfoPath 2003 SDK as described below.
And voila, the forms are now using the form template for the folder.

Here is a detailed example of how to run PIFix. /v is for the template version and /prv is for the product version, the lowest version of InfoPath you support for this form. For InfoPath 2010, this value is 14.0.0. If you set it, you can no longer open the document in an older version of InfoPath. The /url parameter sets where the document will look for its template. All these values can be extracted from the dummy file we created in the steps above. I used Windows Explorer to map \\sharepointserver\sites\mysite\my form library\ to Z: before doing anything on the command prompt.
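If you would rather not read the values off the dummy file by eye, a small script can pull them out. This is a minimal Python sketch, assuming the values sit in an mso-infoPathSolution processing instruction at the top of the XML file, in attributes named solutionVersion, productVersion and href. Those names and the function name are my assumptions, so check them against your own dummy file.

```python
import re

# Matches the processing instruction and captures its attribute text.
PI_PATTERN = re.compile(r"<\?mso-infoPathSolution\s+(.*?)\?>", re.DOTALL)
# Matches name="value" pairs inside the processing instruction.
ATTR_PATTERN = re.compile(r'([\w:-]+)\s*=\s*"([^"]*)"')


def read_solution_info(xml_text):
    """Return the attributes of the mso-infoPathSolution instruction as a dict."""
    match = PI_PATTERN.search(xml_text)
    if match is None:
        raise ValueError("no mso-infoPathSolution processing instruction found")
    return dict(ATTR_PATTERN.findall(match.group(1)))


def describe_for_pifix(info):
    """Pair each PIFix parameter from the post with the value found.

    Assumed mapping: solutionVersion -> /v, productVersion -> /prv,
    href -> /url.
    """
    return {
        "/v": info.get("solutionVersion"),
        "/prv": info.get("productVersion"),
        "/url": info.get("href"),
    }
```

Read the dummy file as text, pass it to read_solution_info, and describe_for_pifix gives you the three values next to the parameter each one belongs to. The sketch only reads the file and does not run PIFix.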


Changing the Windows hosts file with PowerShell

I've barely started looking into PowerShell, but in my mind, every developer needs to know something about it, at least if you use cmd.exe for anything today. I've already run into a problem: I tried to script changes to the hosts file on my computer.

It seems that Windows 7 (at least my x64 installation with Norwegian regional settings) does not tolerate a hosts file created by PowerShell. I've tried -encoding ASCII and other encodings, but Windows just ignores the file until I create a new one with Notepad.

The solution seems to be to use Clear-Content on the file and then append text to the end, like this non-modifying code example shows. You typically want to do some search and replace, or something else, to change $content into $newcontent. You shouldn't run it without backing up the hosts file first.
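The idea of emptying the existing file and writing into it, instead of creating a new file, is not specific to PowerShell. As a comparison only, here is a minimal Python sketch of the same approach. It is not the PowerShell script from this post, and the function name, the ASCII default and the transform callback are assumptions for illustration.

```python
from pathlib import Path


def rewrite_in_place(path, transform, encoding="ascii"):
    """Replace a file's content without deleting or recreating the file.

    The existing file is opened for update, emptied, and then written to,
    so the file itself is never replaced by a new one.
    """
    path = Path(path)
    # newline="" keeps the existing line endings exactly as they are.
    with path.open("r+", encoding=encoding, newline="") as handle:
        content = handle.read()
        new_content = transform(content)
        handle.seek(0)
        handle.truncate()  # empty the file, comparable to clearing its content
        handle.write(new_content)
    return new_content
```

Passing a transform that returns its input unchanged gives a non-modifying run. As with the PowerShell version, back up the hosts file before pointing anything like this at it.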


I found this way of doing things after a lengthy search on the net, in a piece of source code by Mark Embling on GitHub.

I'm still not sure why Windows doesn't like a hosts file written directly by PowerShell. I've compared one working and one non-working file in a hex editor without seeing any differences (such as byte order marks), but I'm sure there is an explanation somewhere. My guess is that this solution leaves the file almost as it was, without touching any of its attributes.