Recently we inherited a new project where FileMaker was being used to manage the content for a website, with an ODBC connection updating the MySQL database behind the site.
The previous developers were using numeric IDs to link from page to page and category to category to sub-category to company etc. While this was certainly efficient it resulted in URLs like
https://www.site.com/16/220/1472
The client wasn’t very happy with this solution, if for no other reason than it made looking at analytics for the site really difficult because they couldn’t tell at a glance what category 16 was, or sub category 220, or who company 1472 were!
Both the client and their SEO consultant wanted those numbers replaced with a URL which described the content – a ‘slug’ as it’s often referred to.
Clearly URLs can only safely contain certain characters, if a ? or an & turned up in the URL it could change the meaning of the URL to the web server, and non-standard characters also pose other issues (things like ä or ę or any of the other high ASCII character).
We came up with a FileMaker custom function we call CreateWebSafeSlug
which takes a single parameter text
.
Substitute (
Filter (
Lower (
Substitute (
ReplaceAccents(
TrimAll ( text ; 0 ; 0 )
) ; [ "&" ; "and" ] ; [ " " ; "-" ] ; [ "." ; "-" ] )
) ;
"abcdefghijklmnopqrstuvwxyz1234567890-"
) ; ["----" ; "-"] ; ["---" ; "-"] ; ["--" ; "-" ]
)
As you can see we’re doing a number of things to get the output we wanted (from the ‘inside’ out):
- trimming all the whitespace from either end of the text
- replacing all ‘special’ characters with their nearest ‘regular’ character using another custom function
ReplaceAccents
which is below - replacing ‘&’ with ‘and’, and both a space and a full-stop (period) with a hyphen
- converting everything to lowercase
- filtering out anything that might still be in there which we don’t want
- doing some tidying up to remove multiple consecutive hyphens which can come about from dirty data, existing hyphens (e.g. {space}-{space} having been in the text) etc
All this means that if we had the text
François & Chloé visit the Musée du Louvre
We’d end up with
francois-and-chloe-visit-the-musee-du-louvre
Which turns ‘risky’ text into something which all browsers, search engines, and users find more useful than ‘1472’!
The ReplaceAccents
custom function also takes a single text
parameter and looks like this.
Substitute(text;
["²";"2"];
["³";"3"];
["¹";"1"];
["¼";"1/4"];
["½";"1/2"];
["¾";"3/4"];
["À";"A"];
["Á";"A"];
["Â";"A"];
["Ã";"A"];
["Ä";"A"];
["Å";"A"];
["Æ";"AE"];
["Ç";"C"];
["È";"E"];
["É";"E"];
["Ê";"E"];
["Ë";"E"];
["Ì";"I"];
["Í";"I"];
["Î";"I"];
["Ï";"I"];
["Ð";"D"];
["Ñ";"N"];
["Ò";"O"];
["Ó";"O"];
["Ô";"O"];
["Õ";"O"];
["Ö";"O"];
["Ø";"O"];
["Ù";"U"];
["Ú";"U"];
["Û";"U"];
["Ü";"U"];
["Ý";"Y"];
["Þ";"P"];
["ß";"B"];
["à";"a"];
["á";"a"];
["â";"a"];
["ã";"a"];
["ä";"a"];
["å";"a"];
["æ";"ae"];
["ç";"c"];
["è";"e"];
["é";"e"];
["ê";"e"];
["ë";"e"];
["ì";"i"];
["í";"i"];
["î";"i"];
["ï";"i"];
["ð";"d"];
["ñ";"n"];
["ò";"o"];
["ó";"o"];
["ô";"o"];
["õ";"o"];
["ö";"o"];
["ø";"o"];
["ù";"u"];
["ú";"u"];
["û";"u"];
["ü";"u"];
["ý";"y"];
["þ";"p"];
["ÿ";"y"];
["ƒ";"f"];
["…";"..."];
["′";"'"];
["″";"''"];
["⁄";"/"];
["℘";"P"];
["ℑ";"I"];
["ℜ";"R"];
["™";"(R)"];
["−";"-"];
["∗";"*"];
["≡";"="];
["Œ";"OE"];
["œ";"oe"];
["Š";"S"];
["š";"s"];
["Ÿ";"Y"];
["ˆ";"^"];
["˜";"~"];
[" ";" "];
[" ";" "];
[" ";" "];
["–";"-"];
["—";"-"];
["‘";"'"];
["’";"'"];
["‚";","];
["\“";"\""];
["\”";"\""];
["\„";",,"];
["‹";"'"];
["›";"'"]
)
As you can see it’s not only ‘accents’ which are being converted into low ASCII characters, but lots of other things which over time have turned up – usually pasted from word – in the content.
This function particularly feels like a ‘hack’, so we’d love to hear from you if you’ve solved this problem in another way!