{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generate an ATOM entry for the catalog and perform its ingestion\n",
    "\n",
    "In this scenario, we generate an ATOM entry containing the metadata of a product and feed it into the catalog (we use a sample product downloaded from a public repository).\n",
    "\n",
    "## 1. Set the necessary variables\n",
    "\n",
    "The following section defines all the necessary information as variables so the code below can be easily reused."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import getpass\n",
    "\n",
    "# Set the credentials (Ellip username and API key)\n",
    "username = raw_input(\"What is your Ellip username? \")\n",
    "api_key = getpass.getpass(\"What is your Ellip API key? \")\n",
    "\n",
    "# Set the name of the destination index on the Terradue catalog\n",
    "index_name = raw_input(\"What is the destination index name? (press Enter to confirm default [{0}]) \".format(username))\n",
    "\n",
    "if not index_name:\n",
    "    index_name = username\n",
    "\n",
    "# Set the catalog endpoint URL\n",
    "endpoint = \"https://catalog.terradue.com/{0}\".format(index_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Define a function to generate an *EarthObservation* extension element"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import lxml.etree as etree\n",
    "import numpy as np\n",
    "from shapely.wkt import loads\n",
    "\n",
    "def eop_metadata(metadata):\n",
    "\n",
    "    # Define namespace URIs\n",
    "    opt = 'http://www.opengis.net/opt/2.1'\n",
    "    om = 'http://www.opengis.net/om/2.0'\n",
    "    gml = 'http://www.opengis.net/gml/3.2'\n",
    "    eop = 'http://www.opengis.net/eop/2.1'\n",
    "    sar = 'http://www.opengis.net/sar/2.1'\n",
    "\n",
    "    # Define the element structure\n",
    "    # There are several levels for much of the content and many elements are only containers for other elements;\n",
    "    # elements that hold actual values are marked as 'content element' below.\n",
    "    root = etree.Element('{%s}EarthObservation' % opt)\n",
    "\n",
    "    phenomenon_time = etree.SubElement(root, '{%s}phenomenonTime' % om)\n",
    "    time_period = etree.SubElement(phenomenon_time, '{%s}TimePeriod' % gml)\n",
    "    # Content element:\n",
    "    begin_position = etree.SubElement(time_period, '{%s}beginPosition' % gml)\n",
    "    # Content element:\n",
    "    end_position = etree.SubElement(time_period, '{%s}endPosition' % gml)\n",
    "\n",
    "    procedure = etree.SubElement(root, '{%s}procedure' % om)\n",
    "    earth_observation_equipment = etree.SubElement(procedure, '{%s}EarthObservationEquipment' % eop)\n",
    "    acquisition = etree.SubElement(earth_observation_equipment, '{%s}acquisitionParameters' % eop)\n",
    "    # Content element:\n",
    "    orbit_number = etree.SubElement(acquisition, '{%s}orbitNumber' % eop)\n",
    "    # Content element:\n",
    "    wrs_longitude_grid = etree.SubElement(acquisition, '{%s}wrsLongitudeGrid' % eop)\n",
    "    # Content element:\n",
    "    orbit_direction = etree.SubElement(acquisition, '{%s}orbitDirection' % eop)\n",
    "\n",
    "    feature_of_interest = etree.SubElement(root, '{%s}featureOfInterest' % om)\n",
    "    footprint = etree.SubElement(feature_of_interest, '{%s}Footprint' % eop)\n",
    "    multi_extentOf = etree.SubElement(footprint, '{%s}multiExtentOf' % eop)\n",
    "    multi_surface = etree.SubElement(multi_extentOf, '{%s}MultiSurface' % gml)\n",
    "    surface_members = etree.SubElement(multi_surface, '{%s}surfaceMembers' % gml)\n",
    "    polygon = etree.SubElement(surface_members, '{%s}Polygon' % gml)\n",
    "    exterior = etree.SubElement(polygon, '{%s}exterior' % gml)\n",
    "    linear_ring = etree.SubElement(exterior, '{%s}LinearRing' % gml)\n",
    "    # Content element:\n",
    "    poslist = etree.SubElement(linear_ring, '{%s}posList' % gml)\n",
    "\n",
    "    result = etree.SubElement(root, '{%s}result' % om)\n",
    "    earth_observation_result = etree.SubElement(result, '{%s}EarthObservationResult' % opt)\n",
    "    # Content element:\n",
    "    cloud_cover_percentage = etree.SubElement(earth_observation_result, '{%s}cloudCoverPercentage' % opt)\n",
    "\n",
    "    metadata_property = etree.SubElement(root, '{%s}metaDataProperty' % eop)\n",
    "    earth_observation_metadata = etree.SubElement(metadata_property, '{%s}EarthObservationMetaData' % eop)\n",
    "    # Content element:\n",
    "    identifier = etree.SubElement(earth_observation_metadata, '{%s}identifier' % eop)\n",
    "    # Content element:\n",
    "    product_type = etree.SubElement(earth_observation_metadata, '{%s}productType' % eop)\n",
    "\n",
    "    vendor_specific = etree.SubElement(earth_observation_metadata, '{%s}vendorSpecific' % eop)\n",
    "    specific_information = etree.SubElement(vendor_specific, '{%s}SpecificInformation' % eop)\n",
    "    # Content element:\n",
    "    local_attribute = etree.SubElement(specific_information, '{%s}localAttribute' % eop)\n",
    "    # Content element:\n",
    "    local_value = etree.SubElement(specific_information, '{%s}localValue' % eop)\n",
    "\n",
    "    # Set values for content elements\n",
    "    begin_position.text = metadata['startdate']\n",
    "    end_position.text = metadata['enddate']\n",
    "    orbit_number.text = metadata['orbitNumber']\n",
    "    wrs_longitude_grid.text = metadata['wrsLongitudeGrid']\n",
    "    orbit_direction.text = metadata['orbitDirection']\n",
    "\n",
    "    # Reverse each WKT coordinate pair (x y becomes y x) before writing the GML posList\n",
    "    coords = np.asarray([t[::-1] for t in list(loads(metadata['wkt']).exterior.coords)]).tolist()\n",
    "    pos_list = ''\n",
    "    for elem in coords:\n",
    "        pos_list += ' '.join(str(e) for e in elem) + ' '\n",
    "\n",
    "    poslist.attrib['count'] = str(len(coords))\n",
    "    poslist.text = pos_list\n",
    "\n",
    "    cloud_cover_percentage.text = metadata['cc']\n",
    "\n",
    "    identifier.text = metadata['identifier']\n",
    "    product_type.text = metadata['productType']\n",
    "\n",
    "    local_attribute.text = 'MY_ATTRIBUTE'\n",
    "    local_value.text = metadata['MY_ATTRIBUTE']\n",
    "\n",
    "    return root"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Define a class for ATOM manipulation\n",
    "\n",
    "This class allows us to add basic elements such as an identifier, a title, an enclosure link and a product date; and to append the ``EarthObservation`` extension created above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import lxml.etree as etree\n",
    "import sys\n",
    "import os\n",
    "import string\n",
    "import hashlib\n",
    "import urllib2\n",
    "import base64\n",
    "import time\n",
    "\n",
    "class Atom:\n",
    "    tree = None\n",
    "    root = None\n",
    "    entry = None\n",
    "\n",
    "    def __init__(self, root):\n",
    "        self.root = root\n",
    "        self.tree = root\n",
    "        self.links = self.root.xpath('/a:feed/a:entry/a:link', namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "        entries = self.root.xpath('/a:feed/a:entry', namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "        if len(entries) > 0:\n",
    "            self.entry = entries[0]\n",
    "\n",
    "    @staticmethod\n",
    "    def from_template():\n",
    "        # Skeleton feed with a single entry; it holds the elements that\n",
    "        # the setter methods below expect to find\n",
    "        template = \"\"\"\n",
    "<feed xmlns=\"http://www.w3.org/2005/Atom\">\n",
    "    <entry>\n",
    "        <title></title>\n",
    "        <summary></summary>\n",
    "        <link rel=\"enclosure\" href=\"\"/>\n",
    "        <date xmlns=\"http://purl.org/dc/elements/1.1/\"></date>\n",
    "        <published></published>\n",
    "        <identifier xmlns=\"http://purl.org/dc/elements/1.1/\"></identifier>\n",
    "    </entry>\n",
    "</feed>\"\"\"\n",
    "        tree = etree.fromstring(template)\n",
    "        return Atom(tree)\n",
    "\n",
    "\n",
    "    @staticmethod\n",
    "    def load(url, username=None, api_key=None):\n",
    "        \"\"\"Load and return the atom file at the location url\n",
    "        \"\"\"\n",
    "\n",
    "        request = urllib2.Request(url)\n",
    "\n",
    "        if username is not None:\n",
    "            base64string = base64.b64encode('%s:%s' % (username, api_key))\n",
    "            request.add_header(\"Authorization\", \"Basic %s\" % base64string)\n",
    "        fp = urllib2.urlopen(request)\n",
    "        tree = etree.parse(fp)\n",
    "        fp.close()\n",
    "        if tree.getroot().tag != \"{http://www.w3.org/2005/Atom}feed\":\n",
    "            raise ValueError('not an Atom feed')\n",
    "\n",
    "        return Atom(tree)\n",
    "\n",
    "\n",
    "    def set_identifier(self, identifier):\n",
    "        \"\"\"Set first atom entry identifier\n",
    "        \"\"\"\n",
    "\n",
    "        el_identifier = self.root.xpath('/a:feed/a:entry/d:identifier',\n",
    "                                        namespaces={'a':'http://www.w3.org/2005/Atom',\n",
    "                                                    'd':'http://purl.org/dc/elements/1.1/'})\n",
    "\n",
    "        el_identifier[0].text = identifier\n",
    "\n",
    "    def get_identifier(self):\n",
    "        el_identifier = self.root.xpath('/a:feed/a:entry/d:identifier',\n",
    "                                        namespaces={'a':'http://www.w3.org/2005/Atom',\n",
    "                                                    'd':'http://purl.org/dc/elements/1.1/'})\n",
    "\n",
    "        if (len(el_identifier) == 0):\n",
    "            return None\n",
    "\n",
    "        return el_identifier[0].text\n",
    "\n",
    "    def get_total_results(self, create=False):\n",
    "        # get OS total results in feed\n",
    "        totalResults = self.root.xpath('/a:feed/os:totalResults', namespaces={'a':'http://www.w3.org/2005/Atom', 'os':'http://a9.com/-/spec/opensearch/1.1/'})\n",
    "\n",
    "        if (len(totalResults) == 0):\n",
    "            return None\n",
    "\n",
    "        return int(totalResults[0].text)\n",
    "\n",
    "    def get_title(self, create=False):\n",
    "        # get or create title\n",
    "        titles = self.root.xpath('/a:feed/a:entry/a:title', namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "\n",
    "        if (len(titles) == 0):\n",
    "            if (create):\n",
    "                titles = [etree.SubElement(self.entry, \"{http://www.w3.org/2005/Atom}title\")]\n",
    "                return titles[0]\n",
    "            return None\n",
    "\n",
    "        return titles[0]\n",
    "\n",
    "    def set_title_text(self, text):\n",
    "        \"\"\"Set first atom entry title\n",
    "        \"\"\"\n",
    "\n",
    "        el_title = self.root.xpath('/a:feed/a:entry/a:title',\n",
    "                                   namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "\n",
    "        el_title[0].text = text\n",
    "\n",
    "    def get_summary(self, create=False):\n",
    "        # get or create summary\n",
    "        summaries = self.root.xpath('/a:feed/a:entry/a:summary', namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "\n",
    "        if (len(summaries) == 0):\n",
    "            if (create):\n",
    "                summaries = [etree.SubElement(self.entry, \"{http://www.w3.org/2005/Atom}summary\")]\n",
    "                return summaries[0]\n",
    "            return None\n",
    "\n",
    "        return summaries[0]\n",
    "\n",
    "    def set_summary_text(self, text):\n",
    "        # get or create summary\n",
    "        summary = self.get_summary(True)\n",
    "\n",
    "        summary.text = text\n",
    "\n",
    "    def get_links(self, rel_type):\n",
    "        # get links\n",
    "        return self.root.xpath('/a:feed/a:entry/a:link[@rel = \"{0}\"]'.format(rel_type), namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "\n",
    "\n",
    "    def set_enclosure_link(self, href, title):\n",
    "\n",
    "        el_enclosure_link = self.root.xpath('/a:feed/a:entry/a:link[@rel=\"enclosure\" and (@href=\"\" or @href=\"{0}\")]'.format(href),\n",
    "                                            namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "\n",
    "        if (len(el_enclosure_link) > 0):\n",
    "            link = el_enclosure_link[0]\n",
    "            link.attrib['href'] = href\n",
    "        else:\n",
    "            link = self.add_enclosure_link(href, title)\n",
    "\n",
    "\n",
    "    def add_enclosure_link(self, href, title):\n",
    "\n",
    "        # Build the element through the lxml API so that special characters\n",
    "        # in href and title (such as '&') are escaped on serialization\n",
    "        link = etree.SubElement(self.entry, \"{http://www.w3.org/2005/Atom}link\")\n",
    "        link.attrib['rel'] = 'enclosure'\n",
    "        link.attrib['title'] = title\n",
    "        link.attrib['href'] = href\n",
    "\n",
    "        return link\n",
    "\n",
    "\n",
    "    def add_extension(self, xml_ext):\n",
    "\n",
    "        # insert the extension right after the first link of the entry\n",
    "        el_entry = self.root.xpath('/a:feed/a:entry/a:link',\n",
    "                                   namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "\n",
    "        el_entry[0].addnext(xml_ext)\n",
    "\n",
    "    def add_link(self, href, rel, title=None, type=None):\n",
    "\n",
    "        link = etree.SubElement(self.root.xpath('/a:feed/a:entry',\n",
    "                                                namespaces={'a':'http://www.w3.org/2005/Atom'})[0], \"{http://www.w3.org/2005/Atom}link\")\n",
    "\n",
    "        link.attrib['href'] = href\n",
    "        link.attrib['rel'] = rel\n",
    "        if title:\n",
    "            link.attrib['title'] = title\n",
    "        if type:\n",
    "            link.attrib['type'] = type\n",
    "\n",
    "\n",
    "    def remove_link(self, rel, link_title=None, link_type=None, link_url=None):\n",
    "        links = self.get_links(rel)\n",
    "        filter = None\n",
    "        value = None\n",
    "\n",
    "        if link_title:\n",
    "            filter = 'title'\n",
    "            value = link_title\n",
    "        elif link_type:\n",
    "            filter = 'type'\n",
    "            value = link_type\n",
    "        elif link_url:\n",
    "            # the URL of an Atom link is stored in its href attribute\n",
    "            filter = 'href'\n",
    "            value = link_url\n",
    "        else:\n",
    "            raise Exception(\"Required parameter link_title, link_type or link_url\")\n",
    "\n",
    "        for link in links:\n",
    "            # use get() so that links lacking the attribute are skipped\n",
    "            if link.attrib.get(filter) == value:\n",
    "                link.getparent().remove(link)\n",
    "\n",
    "\n",
    "    def get_offering_elements(self, offering_code):\n",
    "\n",
    "        return self.root.xpath('/a:feed/a:entry/b:offering[@code=\"{0}\"]'.format(offering_code),\n",
    "                               namespaces={'a':'http://www.w3.org/2005/Atom',\n",
    "                                           'b':'http://www.opengis.net/owc/1.0'})\n",
    "\n",
    "\n",
    "    @staticmethod\n",
    "    def get_operation_elements(offering_element, operation_code=None):\n",
    "\n",
    "        xpath = 'b:operation'\n",
    "        if (operation_code):\n",
    "            xpath += '[@code=\"{0}\"]'.format(operation_code)\n",
    "        return offering_element.xpath(xpath, namespaces={'b':'http://www.opengis.net/owc/1.0'})\n",
    "\n",
    "\n",
    "    def add_offering(self, offering):\n",
    "\n",
    "        self.root.xpath('/a:feed/a:entry', namespaces={'a':'http://www.w3.org/2005/Atom'})[0].append(offering)\n",
    "\n",
    "\n",
    "    def add_offerings(self, offerings):\n",
    "\n",
    "        for offering in offerings:\n",
    "            self.add_offering(offering)\n",
    "\n",
    "\n",
    "    def get_dctspatial(self, create=False):\n",
    "\n",
    "        # get or create dct:spatial\n",
    "        spatials = self.root.xpath('/a:feed/a:entry/c:spatial',\n",
    "                                   namespaces={'a':'http://www.w3.org/2005/Atom',\n",
    "                                               'c':'http://purl.org/dc/terms/'})\n",
    "\n",
    "        if (len(spatials) == 0):\n",
    "            if (create):\n",
    "                spatials = [etree.SubElement(self.entry, \"{http://purl.org/dc/terms/}spatial\")]\n",
    "                return spatials[0]\n",
    "            return None\n",
    "\n",
    "        return spatials[0]\n",
    "\n",
    "    def set_dctspatial(self, wkt):\n",
    "\n",
    "        el_spatial = self.get_dctspatial(True)\n",
    "\n",
    "        el_spatial.text = wkt\n",
    "\n",
    "    def get_dcdate(self, create):\n",
    "\n",
    "        # get or create dcdate\n",
    "        el_dates = self.root.xpath('/a:feed/a:entry/d:date',\n",
    "                                   namespaces={'a':'http://www.w3.org/2005/Atom',\n",
    "                                               'd':'http://purl.org/dc/elements/1.1/'})\n",
    "\n",
    "        if (len(el_dates) == 0):\n",
    "            if (create):\n",
    "                el_dates = [etree.SubElement(self.entry, \"{http://purl.org/dc/elements/1.1/}date\")]\n",
    "                return el_dates[0]\n",
    "            return None\n",
    "\n",
    "        return el_dates[0]\n",
    "\n",
    "    def set_dcdate(self, date):\n",
    "\n",
    "        # get or create dcdate\n",
    "        dcdate = self.get_dcdate(True)\n",
    "\n",
    "        dcdate.text = date\n",
    "\n",
    "\n",
    "    def set_published(self, published):\n",
    "\n",
    "        el_published = self.root.xpath('/a:feed/a:entry/a:published',\n",
    "                                       namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "        el_published[0].text = published\n",
    "\n",
    "    def get_category_by_scheme(self, scheme):\n",
    "\n",
    "        categories = self.root.xpath('/a:feed/a:entry/a:category[@scheme=\"{0}\"]'.format(scheme), namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "        if (len(categories) == 0):\n",
    "            return None\n",
    "\n",
    "        return categories[0]\n",
    "\n",
    "    def get_categories(self, term, scheme=None):\n",
    "\n",
    "        # get categories\n",
    "        filter = '@term=\"{0}\"'.format(term)\n",
    "        if scheme != None:\n",
    "            filter = '{0} and @scheme=\"{1}\"'.format(filter, scheme)\n",
    "\n",
    "        return self.root.xpath('/a:feed/a:entry/a:category[{0}]'.format(filter), namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "\n",
    "    def remove_category(self, term, scheme=None):\n",
    "\n",
    "        # get and remove category\n",
    "        for category in self.get_categories(term, scheme):\n",
    "            category.getparent().remove(category)\n",
    "\n",
    "    def remove_category_by_scheme(self, scheme):\n",
    "\n",
    "        # get categories\n",
    "        filter = '@scheme=\"{0}\"'.format(scheme)\n",
    "\n",
    "        categories = self.root.xpath('/a:feed/a:entry/a:category[{0}]'.format(filter), namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "        for category in categories:\n",
    "            category.getparent().remove(category)\n",
    "\n",
    "    def set_category(self, term, label=None, scheme=None):\n",
    "\n",
    "        categories = self.get_categories(term, scheme)\n",
    "\n",
    "        if (len(categories) == 0):\n",
    "            categories = [etree.SubElement(self.entry, \"{http://www.w3.org/2005/Atom}category\")]\n",
    "\n",
    "        categories[0].attrib['term'] = term\n",
    "        if label != None:\n",
    "            categories[0].attrib['label'] = label\n",
    "        if scheme != None:\n",
    "            categories[0].attrib['scheme'] = scheme\n",
    "\n",
    "\n",
    "    def set_generator(self, uri, version, text):\n",
    "\n",
    "        # get or create generator\n",
    "        el_generator = self.root.xpath('/a:feed/a:entry/a:generator', namespaces={'a':'http://www.w3.org/2005/Atom'})\n",
    "\n",
    "        if (len(el_generator) == 0):\n",
    "            el_generator = [etree.SubElement(self.root.xpath('/a:feed/a:entry',\n",
    "                                                             namespaces={'a':'http://www.w3.org/2005/Atom'})[0], \"{http://www.w3.org/2005/Atom}generator\")]\n",
    "\n",
    "        el_generator[0].attrib['uri'] = uri\n",
    "        el_generator[0].attrib['version'] = version\n",
    "        el_generator[0].text = text\n",
    "\n",
    "\n",
    "    def append_summary_html(self, text):\n",
    "        \"\"\"Append atom summary with text\n",
    "        \"\"\"\n",
    "\n",
    "        # an empty or newly created summary has no text yet\n",
    "        html_summary = self.get_summary(True).text or ''\n",
    "        # the appended text is assumed to be wrapped in a paragraph element\n",
    "        html_summary += \"<p>%s</p>\" % text\n",
    "\n",
    "        self.set_summary_text(html_summary)\n",
    "\n",
    "\n",
    "    def to_string(self, pretty_print = True):\n",
    "\n",
    "        return etree.tostring(self.tree, pretty_print=pretty_print)\n",
    "\n",
    "    def clear_enclosures(self):\n",
    "\n",
    "        links = self.get_links(\"enclosure\")\n",
    "        for link in links:\n",
    "            link.getparent().remove(link)\n",
    "\n",
    "    def get_extensions(self, name, namespace):\n",
    "\n",
    "        return self.root.xpath('/a:feed/a:entry/e:{0}'.format(name),\n",
    "                               namespaces={'a':'http://www.w3.org/2005/Atom',\n",
    "                                           'e':namespace})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Build the *EarthObservation* extension element\n",
    "\n",
    "We define a dictionary containing the metadata and pass it as the argument to the `eop_metadata` function defined in step 2."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "metadata = {'startdate': '2019-01-02T03:04:05.678Z',\n",
    "            'enddate': '2019-01-02T03:05:06.789Z',\n",
    "            'orbitNumber': '99',\n",
    "            'wrsLongitudeGrid': '123',\n",
    "            'orbitDirection': 'DESCENDING',\n",
    "            'wkt': 'POLYGON((10.1 10.2,20.3 10.4,20.5 20.6,10.7 20.8,10.1 10.2))',\n",
    "            'cc': '55',\n",
    "            'identifier': 'MY_PRODUCT',\n",
    "            'productType': 'MY_TYPE',\n",
    "            'MY_ATTRIBUTE': 'MY_VALUE'\n",
    "}\n",
    "\n",
    "# Build the element\n",
    "eo = eop_metadata(metadata)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Show the `EarthObservation` element just created:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(etree.tostring(eo, pretty_print=True))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Build an ATOM feed\n",
    "\n",
    "We create an ATOM feed with one entry to which we append the extension created above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import datetime\n",
    "\n",
    "atom = Atom.from_template()\n",
    "atom.set_identifier(metadata['identifier'])\n",
    "atom.set_title_text(\"Title for MY_PRODUCT\")\n",
    "atom.set_summary_text(\"This is the summary for MY_PRODUCT\")\n",
    "atom.set_dcdate(\"{0}/{1}\".format(metadata['startdate'], metadata['enddate']))\n",
    "# Use the current UTC time so that the trailing 'Z' designator is correct\n",
    "atom.set_published(\"{0}Z\".format(datetime.datetime.utcnow().isoformat()))\n",
    "\n",
    "atom.add_extension(eo)\n",
    "\n",
    "atom.set_enclosure_link(\"https://store.terradue.com/myindex/MY_PRODUCT.tif\", \"Location on storage\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Show the resulting ATOM feed:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(etree.tostring(atom.root, pretty_print=True))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Post the ATOM feed\n",
    "\n",
    "We post the ATOM feed to an index on the catalog (variables are defined in the first step)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "request = requests.post(endpoint,\n",
    "                        headers={\"Content-Type\": \"application/atom+xml\", \"Accept\": \"application/xml\"},\n",
    "                        auth=(username, api_key),\n",
    "                        data=atom.to_string()\n",
    ")\n",
    "\n",
    "if request.status_code == 200:\n",
    "    print('Data item updated at {0}/search?uid={1}&apikey={2} ({3})'.format(endpoint, atom.get_identifier(), api_key, str(request.status_code)))\n",
    "else:\n",
    "    print('Data item NOT updated at {0}/search?uid={1}&apikey={2} ({3})'.format(endpoint, atom.get_identifier(), api_key, str(request.status_code)))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If a product URL with status 200 is displayed, the ATOM feed has been successfully uploaded and the product information is available on the Terradue catalog.\n",
    "\n",
    "**END**"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}