Question

recursively concatenate json attributes

7 years ago
June 14, 2018
4 replies
178 views

ottadini
Contributor
26 replies

I have a response from Azure's OCR service, which looks like this (full response attached):

{
    "language": "en",
    "orientation": "Up",
    "textAngle": 0,
    "regions": [
        {
            "boundingBox": "316,555,1597,123",
            "lines": [
                {
                    "boundingBox": "1515,555,398,29",
                    "words": [
                        {
                            "boundingBox": "1515,555,82,23",
                            "text": "CRMA"
                        },
                        {
                            "boundingBox": "1608,555,154,29",
                            "text": "Completion"
                        },
                        {
                            "boundingBox": "1775,555,138,23",
                            "text": "Document"
                        }
                    ]
                },
                {
                    "boundingBox": "316,632,556,46",
                    "words": [
                        {
                            "boundingBox": "316,632,233,46",
                            "text": "As-built"
                        },
                        {
                            "boundingBox": "570,632,302,46",
                            "text": "Certificate"
                        }
                    ]
                }
            ]
        },

and so on.

I want to concatenate the "text" of each entry in "words" with spaces, join the "lines" with a single newline, and join the "regions" with two newlines, all into a single attribute, so the above snippet would look like:

CRMA Completion Document
As-built Certificate

I'm new to the json transformers and seem to be going around in circles with this. How can I do it?

At the moment I have three JSONFragmenters chained together, followed by three Aggregators chained together. It seems a bit awkward.

daveatsafe
Safer
1632 replies
7 years ago
June 14, 2018

Hi @ottadini,

I have modified your workspace to use a single JSONFragmenter that flattens the fragment into a list attribute. We can then use FME's list manipulation transformers to rebuild your document.

The modified workspace actually has a few more transformers, but it does preserve the order of the regions and lines when rebuilding.

m-azure-json-to-text.fmw
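
For readers who think more easily in code, the flatten-then-rebuild idea can be sketched in Python. This is only an analogy under stated assumptions, not the contents of the attached workspace: the function names `flatten_words` and `rebuild_text` are hypothetical, and the region and line indices stand in for whatever ordering attributes the workspace carries.

```python
from collections import defaultdict


def flatten_words(response):
    """Flatten regions/lines/words into one flat list of
    (region_index, line_index, text) rows, similar in spirit to a list attribute."""
    rows = []
    for r, region in enumerate(response.get("regions", [])):
        for l, line in enumerate(region.get("lines", [])):
            for word in line.get("words", []):
                rows.append((r, l, word["text"]))
    return rows


def rebuild_text(rows):
    """Rebuild the document from the flat rows, keeping region and line order."""
    # Collect the words of each (region, line) pair in arrival order.
    words_by_line = defaultdict(list)
    for r, l, text in rows:
        words_by_line[(r, l)].append(text)

    # Join words with spaces, then collect the finished lines per region.
    lines_by_region = defaultdict(list)
    for r, l in sorted(words_by_line):
        lines_by_region[r].append(" ".join(words_by_line[(r, l)]))

    # Single newline between lines, blank line between regions.
    return "\n\n".join(
        "\n".join(lines_by_region[r]) for r in sorted(lines_by_region)
    )
```

Keeping the indices on every row is what makes it possible to restore the original order after the structure has been flattened.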

ottadini
Author
Contributor
26 replies
7 years ago
June 15, 2018

This is much nicer, thanks Dave! The sorting issue was something I was wrestling with, and ended up with 3 of them at one stage.

takashi
7708 replies
7 years ago
June 15, 2018

You can probably remove the Counter and the Sorter from @DaveAtSafe's solution if you set "json_index" (the attribute created by the JSONFragmenter) as the Aggregator's "Group By" parameter and set its "Input is Ordered by Group" parameter to "Yes".
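
As a rough analogy in Python (not FME's actual implementation), grouping input that already arrives ordered by its group key needs no prior sort, because each group is a consecutive run. The sketch below assumes hypothetical feature dictionaries with a "json_index" key and a "text" key; the function name `aggregate_ordered` is also made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_ordered(features, key="json_index", value="text", sep=" "):
    """Concatenate `value` for each consecutive run of equal `key` values.

    The input must already be ordered by group: groupby only merges
    neighbouring items, so no counter or sort step is involved.
    """
    return [
        (k, sep.join(f[value] for f in group))
        for k, group in groupby(features, key=itemgetter(key))
    ]
```

If the input were not ordered by group, the same key would show up in several separate runs, which is why the ordering matters.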

---------

FME bundles Zorba to execute XQuery expressions, and Zorba supports the JSONiq extension, which lets you manipulate JSON documents through XQuery expressions. So, in an FME workspace, you can use the XMLXQueryExtractor to execute an XQuery expression that includes JSONiq syntax.

Your question can also be solved with a short XQuery expression.

----------

XQuery Expression:

[Edit] "&#10;" is the character reference for the newline character. See also: XQuery/Special Characters

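(: Read the OCR response from the feature attribute as a JSON document :)
let $doc := fme:get-json-attribute("azure_response")
let $regions := {
    for $r in jn:members($doc("regions"))
    let $lines := {
        for $ln in jn:members($r("lines"))
        (: Join the words of one line with single spaces :)
        return fn:string-join(jn:members($ln("words"))("text"), " ")
    }
    (: Join the lines of one region with single newlines :)
    return fn:string-join($lines, "&#10;")
}
(: Join the regions with two newlines :)
return fn:string-join($regions, "&#10;&#10;")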

----------

Surprisingly, the JSONTemplater or the XMLTemplater (Write XML Header: No) can also be used to execute the same expression.
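
For comparison, the same nested join can be sketched in plain Python. This is a minimal illustration rather than something taken from the workspace: the function name `ocr_to_text` is hypothetical, and the sample is just the snippet from the question with the bounding boxes left out.

```python
import json


def ocr_to_text(doc):
    """Join words with spaces, lines with one newline, regions with two newlines."""
    regions = []
    for region in doc.get("regions", []):
        lines = [
            " ".join(word["text"] for word in line.get("words", []))
            for line in region.get("lines", [])
        ]
        regions.append("\n".join(lines))
    return "\n\n".join(regions)


# Trimmed-down version of the response from the question (one region only).
sample = json.loads("""
{
  "regions": [
    {
      "lines": [
        {"words": [{"text": "CRMA"}, {"text": "Completion"}, {"text": "Document"}]},
        {"words": [{"text": "As-built"}, {"text": "Certificate"}]}
      ]
    }
  ]
}
""")

print(ocr_to_text(sample))
```

The three nested joins mirror the three fn:string-join calls in the expression above, one per level of the JSON structure.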

ottadini
Author
Contributor
26 replies
7 years ago
June 16, 2018

@takashi -- thank you! I imagine that those changes will speed up the process a little.

Very impressive to see the breadth of your knowledge again! JSONiq seems very powerful, and something I could use in standalone Python as well.
