FME Hub user andreas_h just uploaded a new transformer to the FME Hub.

URLResolver

Overview

The URLResolver is a custom transformer that resolves relative URLs against a base URL, with input validation. It checks that the base URL is absolute and that the relative URL is relative before combining them with urllib.parse.urljoin.

Key Features

URL validation (base_url must be absolute, relative_url must be relative)

RFC 3986 compliant URL resolution using urllib.parse.urljoin

Handles all relative URL types (root-relative, parent-relative, document-relative, query-only, fragment-only)

Automatic path navigation resolution (.. and . segments)

Error reporting with detailed messages
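As a small illustration of the path navigation feature, the following sketch shows how urllib.parse.urljoin collapses . and .. segments. The base URL and relative path are made up for the example and do not come from the transformer itself.

```python
from urllib.parse import urljoin

# Hypothetical base URL, chosen only for illustration.
base = "https://example.com/a/b/c.html"

# "." is dropped and "d/.." cancels out, so the result stays under /a/b/.
resolved = urljoin(base, "./d/../e.html")
print(resolved)  # https://example.com/a/b/e.html
```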

Input Attributes

The transformer expects the following attributes on input features:

| Attribute | Required | Validation | Description | Example |
| --- | --- | --- | --- | --- |
| _base_url | Yes | Must be absolute with scheme | The base URL (must include protocol) | https://example.com/path/page.html |
| _relative_url | Yes | Must be relative without scheme | The relative URL to resolve | ../other.html, /path/file.json, ?query=value |

Output Attributes

The transformer adds the following attributes to output features:

| Attribute | Type | Description |
| --- | --- | --- |
| _resolved_url | String | The absolute resolved URL |
| _url_error | String | Error message if resolution failed |
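The attribute contract can be sketched roughly as follows. This is not the transformer's actual source: resolve_feature is a hypothetical helper, a plain dict stands in for an FME feature, and the check for a network location on the base URL is an assumption. The two scheme-related messages use only the prefix shown in this post, since the full wording is not given.

```python
from urllib.parse import urljoin, urlsplit


def resolve_feature(attrs: dict) -> dict:
    """Return a copy of attrs with _resolved_url or _url_error added.

    Hypothetical helper: the dict stands in for a feature's attributes.
    """
    out = dict(attrs)
    base = (attrs.get("_base_url") or "").strip()
    rel = (attrs.get("_relative_url") or "").strip()

    try:
        if not base:
            out["_url_error"] = "Missing or empty _base_url attribute"
        elif not rel:
            out["_url_error"] = "Missing or empty _relative_url attribute"
        # Assumption: "absolute" means scheme plus network location.
        elif not (urlsplit(base).scheme and urlsplit(base).netloc):
            out["_url_error"] = "base_url must be an absolute URL with a scheme"
        elif urlsplit(rel).scheme:
            out["_url_error"] = "relative_url must be a relative URL without a scheme"
        else:
            out["_resolved_url"] = urljoin(base, rel)
    except ValueError as exc:
        # urlsplit/urljoin raise ValueError on some malformed inputs.
        out["_url_error"] = str(exc)
    return out


ok = resolve_feature({"_base_url": "https://example.com/docs/guide.html",
                      "_relative_url": "page.html"})
bad = resolve_feature({"_base_url": "example.com/page",
                       "_relative_url": "other.html"})
print(ok.get("_resolved_url"))
print(bad.get("_url_error"))
```

In this sketch a feature gets exactly one of the two output attributes, which makes it easy to route failures separately downstream.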

URL Validation

base_url - Must be an absolute URL with a scheme:

Allowed: https://example.com, http://localhost:8080/path/, ftp://server.com/file.txt

Allowed: URLs with paths: https://example.com/folder/page.html

Allowed: URLs with query strings: https://example.com/page?param=value

NOT allowed: Relative URLs such as example.com, //example.com, /path/page

relative_url - Must be a relative URL without a scheme:

Allowed: Document-relative: page.html, folder/file.json

Allowed: Parent-relative: ../page.html, ../../data/file.json

Allowed: Root-relative: /path/page, /api/data.json

Allowed: Query-only: ?param=value&other=test

Allowed: Fragment-only: #section1

Allowed: Protocol-relative: //cdn.example.com/script.js

NOT allowed: Absolute URLs like https://other.com/page, http://example.com
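The validation rules above could be approximated with urllib.parse.urlsplit, as in the sketch below. The function names are hypothetical, and requiring a network location in addition to a scheme for the base URL is an assumption that happens to match the listed examples; the transformer's real checks may differ.

```python
from urllib.parse import urlsplit


def is_absolute_url(url: str) -> bool:
    """True if url has a scheme and a network location (assumed rule)."""
    parts = urlsplit(url)
    return bool(parts.scheme and parts.netloc)


def is_relative_url(url: str) -> bool:
    """True if url is non-empty and carries no scheme."""
    return bool(url) and not urlsplit(url).scheme


# Examples taken from the lists above.
for candidate in ["https://example.com", "//example.com", "/path/page"]:
    print(candidate, "absolute:", is_absolute_url(candidate))
for candidate in ["../page.html", "#section1", "https://other.com/page"]:
    print(candidate, "relative:", is_relative_url(candidate))
```

One caveat of a scheme-based check: a relative path whose first segment contains a colon (for example a:b) parses as having a scheme, so it would be rejected unless written as ./a:b.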

URL Resolution Examples

Document-Relative URLs

| Base URL | Relative URL | Resolved URL |
| --- | --- | --- |
| https://example.com/docs/guide.html | page.html | https://example.com/docs/page.html |
| https://example.com/docs/guide.html | images/logo.png | https://example.com/docs/images/logo.png |

Parent-Relative URLs

| Base URL | Relative URL | Resolved URL |
| --- | --- | --- |
| https://example.com/docs/api/guide.html | ../index.html | https://example.com/docs/index.html |
| https://example.com/docs/api/guide.html | ../../home.html | https://example.com/home.html |

Root-Relative URLs

| Base URL | Relative URL | Resolved URL |
| --- | --- | --- |
| https://example.com/docs/guide.html | /api/data.json | https://example.com/api/data.json |
| https://example.com/docs/guide.html | / | https://example.com/ |

Query and Fragment URLs

| Base URL | Relative URL | Resolved URL |
| --- | --- | --- |
| https://example.com/page.html | ?search=test | https://example.com/page.html?search=test |
| https://example.com/page.html?old=value | ?new=value | https://example.com/page.html?new=value |
| https://example.com/page.html | #section1 | https://example.com/page.html#section1 |

Protocol-Relative URLs

| Base URL | Relative URL | Resolved URL |
| --- | --- | --- |
| https://example.com/page.html | //cdn.example.com/script.js | https://cdn.example.com/script.js |
| http://example.com/page.html | //cdn.example.com/script.js | http://cdn.example.com/script.js |
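The resolution examples above can be reproduced directly with urllib.parse.urljoin, without the transformer. The sketch below is only a table-driven check; CASES and mismatches are names introduced for this illustration.

```python
from urllib.parse import urljoin

# (base URL, relative URL, expected result), copied from the examples above.
CASES = [
    ("https://example.com/docs/guide.html", "page.html",
     "https://example.com/docs/page.html"),
    ("https://example.com/docs/guide.html", "images/logo.png",
     "https://example.com/docs/images/logo.png"),
    ("https://example.com/docs/api/guide.html", "../index.html",
     "https://example.com/docs/index.html"),
    ("https://example.com/docs/api/guide.html", "../../home.html",
     "https://example.com/home.html"),
    ("https://example.com/docs/guide.html", "/api/data.json",
     "https://example.com/api/data.json"),
    ("https://example.com/docs/guide.html", "/",
     "https://example.com/"),
    ("https://example.com/page.html", "?search=test",
     "https://example.com/page.html?search=test"),
    ("https://example.com/page.html?old=value", "?new=value",
     "https://example.com/page.html?new=value"),
    ("https://example.com/page.html", "#section1",
     "https://example.com/page.html#section1"),
    ("https://example.com/page.html", "//cdn.example.com/script.js",
     "https://cdn.example.com/script.js"),
    ("http://example.com/page.html", "//cdn.example.com/script.js",
     "http://cdn.example.com/script.js"),
]

# Collect every case where urljoin disagrees with the documented result.
mismatches = [
    (base, rel, urljoin(base, rel), expected)
    for base, rel, expected in CASES
    if urljoin(base, rel) != expected
]
print(mismatches or "all documented examples resolve as listed")
```

Note that a protocol-relative URL inherits only the scheme of the base; host and path come from the relative URL.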

Invalid Input Examples

| Base URL | Relative URL | Result | Error Message |
| --- | --- | --- | --- |
| example.com/page | other.html | Error | "base_url must be an absolute URL with a scheme..." |
| //example.com/page | other.html | Error | "base_url must be an absolute URL with a scheme..." |
| https://example.com | https://other.com/page | Error | "relative_url must be a relative URL without a scheme..." |
| (empty) | page.html | Error | "Missing or empty _base_url attribute" |
| https://example.com | (empty) | Error | "Missing or empty _relative_url attribute" |


