I have done some document creation with FME.
The MSWordStyler and Writer can start with a template and use its styling. The main limitation is that the writer can only append to an existing document; it can’t replace values within a template. If all you require is a simple table with styles, or combinations of text, tables and images, then it should do the trick, and you can recreate the table or whole document after any attribute change. It’s better to stick with something simple if that’s all you need.
If you need a content type which is not supported by the WordStyler, then Python is the best (only?) option. The WordStyler limitations I hit were with hyperlinks, in-document references, and replicating the required closing page for the document.
https://stackoverflow.com/questions/56374018/how-to-search-and-replace-a-word-text-in-word-document-using-python-docx This was a useful resource for going down the Python route. Basically the idea is you set up ‘parameters’ (placeholder text and image alt text) in your template document, define Python functions to replace text or images within a document, and then just:
doc = docx.Document(inputfilename)
docx_find_replace_text(doc, feature.getAttribute('parameter1_name'), feature.getAttribute('parameter1_value'))
docx_find_replace_text(doc, feature.getAttribute('parameter2_name'), feature.getAttribute('parameter2_value'))
docx_find_replace_image(doc, alt_text1, new_image_path1, image_width_cm1)
docx_find_replace_image(doc, alt_text2, new_image_path2, image_width_cm2)
doc.save(outputfilename)
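For what it's worth, here is a minimal sketch of what the text-replacement helper could look like at paragraph level. It is not the code from the Stack Overflow answer; `replace_in_paragraph` is a hypothetical name, and the only thing it assumes is an object with `.runs`, each run having a writable `.text` (which python-docx paragraphs provide). Word often splits a typed placeholder across several runs, so the sketch falls back to merging the paragraph text into the first run when that happens.

```python
def replace_in_paragraph(paragraph, old_text, new_text):
    """Replace old_text in one paragraph, even if it is split across runs.

    `paragraph` is any object with a `.runs` sequence whose items have a
    writable `.text` (hypothetical helper, not part of python-docx).
    Returns True if the paragraph contained old_text.
    """
    runs = list(paragraph.runs)
    joined = "".join(run.text for run in runs)
    if not old_text or old_text not in joined:
        return False

    hits_inside_runs = sum(run.text.count(old_text) for run in runs)
    if hits_inside_runs == joined.count(old_text):
        # Every occurrence sits inside a single run: edit run by run,
        # which leaves the formatting of each run untouched.
        for run in runs:
            run.text = run.text.replace(old_text, new_text)
    else:
        # At least one occurrence spans several runs: put the whole
        # paragraph text into the first run and empty the others.
        # Formatting that differed between runs is lost in this case.
        runs[0].text = joined.replace(old_text, new_text)
        for run in runs[1:]:
            run.text = ""
    return True
```

With python-docx you would call it in a loop such as `for p in doc.paragraphs: replace_in_paragraph(p, name, value)`. The trade-off is the formatting loss in the fallback branch, so it is still worth typing each placeholder in one go in the template.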
In my experience, I’ve always ended up with a combination of FME and Python/docx to create or manipulate any Word documents with even slight complexity, e.g. images or hyperlinks.
MSWordStyler how to include picture in table? | Community
Does this have to be in Word? If an HTML page is also ok, you could try the HTMLReportGenerator.
That said: a docx is in the basis not much more than an xml document (see for yourself: rename the extension from .docx to .zip, then unzip...). So you could go down that route and modify the underlying xml, then re-zip everything to a docx. But that requires some reverse engineering.
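A rough sketch of that unzip, edit, re-zip route using only the Python standard library. Everything here is an assumption for illustration: the function name `replace_in_docx_xml` is made up, the part-name prefixes are what Word usually produces rather than something guaranteed, the parts are assumed to be UTF-8, and I am assuming `<` and `>` are stored XML-escaped (so `<<NAME>>` appears as `&lt;&lt;NAME&gt;&gt;`). A placeholder that Word has split across several runs will not match.

```python
import zipfile
from xml.sax.saxutils import escape

# Part-name prefixes assumed from how Word usually lays out a .docx archive.
DEFAULT_PREFIXES = ("word/document.xml", "word/header", "word/footer")


def replace_in_docx_xml(src, dst, replacements, prefixes=DEFAULT_PREFIXES):
    """Copy archive src to dst, replacing text in the selected XML parts.

    src and dst may be paths or binary file objects and must be different.
    Keys and values of `replacements` are plain text; both are XML-escaped
    before matching so that special characters stay valid XML.
    """
    escaped = {escape(old): escape(new) for old, new in replacements.items()}
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename.startswith(prefixes):
                text = data.decode("utf-8")
                for old, new in escaped.items():
                    text = text.replace(old, new)
                data = text.encode("utf-8")
            # Every other entry is copied through unchanged.
            zout.writestr(item.filename, data)
```

Usage would be along the lines of `replace_in_docx_xml("template.docx", "out.docx", {"<<CREW>>": "Team A"})`, with both file names being placeholders of my own.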
@ctredinnick Got locked out of my hawaiialex account! But thank you for the response! All I need it to insert are attributes and images that will be attached within the EGDB, and luckily no hyperlinks. The only thing is I’ll need to create the table in a different style than the generic stacked tables and have it laid out like this:
| Header1 | Header2 |
|---|---|
| Attribute1 | Values |
| Attribute2 | Values |
| Attribute3 | Values |
I was digging around and found that I could play around with the AttributeExploder before the MSWordStyler to achieve this. I’ll work on a template file in the meantime and experiment with this simpler approach before tackling the python! Thanks again!
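To illustrate the reshaping I mean, here is a small Python sketch of turning one record into name/value rows, which is roughly what exploding the attributes does before the table is styled. The function name, the `skip` parameter and the sample field names are all hypothetical, and this is not how FME implements the transformer.

```python
def explode_attributes(feature, skip=()):
    """Turn one record (a dict) into (name, value) rows, one per attribute.

    `skip` lists attribute names to leave out, for example geometry or
    ID fields. None values become empty strings so the table cell is blank.
    """
    return [
        (name, "" if value is None else str(value))
        for name, value in feature.items()
        if name not in skip
    ]


# One wide record becomes a two-column table body.
rows = explode_attributes(
    {"OBJECTID": 7, "CREW": "Team A", "PERMIT": None},
    skip=("OBJECTID",),
)
for name, value in rows:
    print(f"{name} | {value}")
```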
Hello @hawaiialex / @boomer87, sorry to hear you were locked out of your community account. Firstly, you can reset your password to recover your hawaiialex account using the instructions from the FME Account FAQ (see: I forgot my FME Account password)!
Secondly, I agree: I think Python may be the best route. If I recall correctly, FME's MS Word Writer leverages base files as opposed to template files. The writer does not delete or modify the base file in any way. Instead, you can apply styles to the base file, which is explained in this article.
As an alternative to Python, you could try working with the XML files contained within the docx archive directly. However, this method does require knowledge of XML in FME, which is not always the simplest (e.g. unzipping the .docx, using XQueryXMLExtractor, XMLUpdater, rezipping the .docx, etc.). Hope this helps! Kailin
Hi @ctredinnick , @kailinatsafe , @ebygomm, @s.jager
Thank you so much for your responses and tips :) It does need to be in docx word so I hadn’t played around with HTML.
I’ve taken advice and experimented with Python to get almost the desired results. Right now I have the template in docx format and all the relevant field values are being extracted from the GIS feature class and replacing the <<placeholder>> text in the Word document as intended.
However, I am just having trouble getting the <<DESCRIPTION>> and <<PROJECT_NAME>> to be replaced with the relevant values from the feature class in the header of the docx. It is still showing up as the placeholders. Are there any tips to ensure the header/footer info is updated? The header should have the description and project name while the footer should have the date in DD MONTH YYYY format.
import arcpy
import docx
# Define the paths
inputfilename = r"C:\FieldSummary_TEMPLATE.docx"
outputfilename = r"C:\FieldSummary_TEMPLATE_SAVED.docx"
feature_class = r"C:\pathwaytowork.gdb\featureclassname"
# Function to replace text in body paragraphs and tables, run by run
# (a placeholder that Word has split across several runs is not matched)
def docx_find_replace_text(doc, old_text, new_text):
    for paragraph in doc.paragraphs:
        if old_text in paragraph.text:
            for run in paragraph.runs:
                if old_text in run.text:
                    run.text = run.text.replace(old_text, new_text)
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    if old_text in paragraph.text:
                        for run in paragraph.runs:
                            if old_text in run.text:
                                run.text = run.text.replace(old_text, new_text)
# Function to replace text in header paragraphs, run by run
# (tables inside the header and the footers are not visited)
def docx_find_replace_header(doc, old_text, new_text):
    for section in doc.sections:
        header = section.header
        for paragraph in header.paragraphs:
            if old_text in paragraph.text:
                for run in paragraph.runs:
                    if old_text in run.text:
                        run.text = run.text.replace(old_text, new_text)
# Load the Word document
doc = docx.Document(inputfilename)
# Define the mapping between placeholders and feature class fields
field_mapping = {
    "<<PROJECT_NAME>>": "PROJECT_NAME",
    "<<FIELD_DATE>>": "FIELD_DATE",
    "<<CREW>>": "CREW",
    "<<PERMIT>>": "PERMIT",
    "<<DIVISION>>": "DIVISION",
    "<<DESCRIPTION>>": "DESCRIPTION",
    "<<DATE>>": "DATE",
    "<<DIST_REQ_1>>": "DIST_REQ_1",
    "<<DIST_REQ_2>>": "DIST_REQ_2",
    "<<HISTORY>>": "HISTORY",
    "<<SUB_OB_1>>": "SUB_OB_1",
    "<<SUB_OB_2>>": "SUB_OB_2",
    "<<ARCH_OB>>": "ARCH_OB",
    "<<REC>>": "REC"
}
# Get the feature class fields (values from the mapping)
feature_fields = list(field_mapping.values())
# Iterate through the feature class
# (only the first row has any effect: once its values are written,
# no placeholders are left in the document for later rows)
with arcpy.da.SearchCursor(feature_class, feature_fields) as cursor:
    for feature in cursor:
        for placeholder, field in field_mapping.items():
            # Find the index of the field in the feature_fields list
            field_index = feature_fields.index(field)
            # Get the corresponding value from the feature
            value = feature[field_index]
            # Replace the placeholder with the value
            docx_find_replace_text(doc, placeholder, str(value))
            # Replace the placeholder in the header for PROJECT_NAME and DESCRIPTION
            if placeholder in ["<<PROJECT_NAME>>", "<<DESCRIPTION>>"]:
                docx_find_replace_header(doc, placeholder, str(value))
# Save the updated document
doc.save(outputfilename)
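On the header/footer question, a hedged sketch of one way to visit every place a placeholder might sit: body, headers and footers, including tables nested inside them and the first-page and even-page variants. The function names are hypothetical. The attribute names (`header`, `footer`, `first_page_header`, `is_linked_to_previous` and so on) are the ones I understand python-docx sections to expose, but the code only relies on objects having those attributes. The date helper assumes "DD MONTH YYYY" means a zero-padded day and the full month name.

```python
from datetime import date

# Header/footer attributes assumed to exist on each python-docx section.
HEADER_FOOTER_PARTS = (
    "header", "first_page_header", "even_page_header",
    "footer", "first_page_footer", "even_page_footer",
)


def iter_paragraphs(container):
    """Yield the paragraphs of a body, header, footer or table cell,
    descending into any tables (and tables nested in their cells)."""
    yield from container.paragraphs
    for table in container.tables:
        for row in table.rows:
            for cell in row.cells:
                yield from iter_paragraphs(cell)


def iter_all_paragraphs(doc):
    """Yield body paragraphs, then header and footer paragraphs."""
    yield from iter_paragraphs(doc)
    for section in doc.sections:
        for part_name in HEADER_FOOTER_PARTS:
            part = getattr(section, part_name)
            # A linked part has no content of its own; it shows the
            # previous section's header or footer, so skip it.
            if part.is_linked_to_previous:
                continue
            yield from iter_paragraphs(part)


def replace_everywhere(doc, old_text, new_text):
    """Run-by-run replacement across the whole document.
    Returns the number of runs that were changed."""
    changed = 0
    for paragraph in iter_all_paragraphs(doc):
        for run in paragraph.runs:
            if old_text in run.text:
                run.text = run.text.replace(old_text, new_text)
                changed += 1
    return changed


def format_report_date(value):
    """Format a date or datetime as e.g. '05 March 2024'.
    The month name follows the current locale."""
    return value.strftime("%d %B %Y")
```

A call such as `replace_everywhere(doc, "<<DATE>>", format_report_date(value))` would then cover the footer as well. If it returns 0 for a placeholder you can see in the header, the placeholder is probably split across runs or sits in a text box, which this traversal does not reach; retyping it in one go in the template is the first thing I would try.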