The challenge of Artificial Intelligence: fast delivery, continuous development

Jose Luis Sánchez Maroñas
Several authors

August 15, 2023

Developing an Artificial Intelligence project (a broad concept that also encompasses computational statistics and machine learning) involves several phases and a high degree of uncertainty, because not every problem can be solved.

To face this reality with optimism, it is worth pursuing solutions that generate direct value and make a difference. In my work as part of Ferrovial’s Digital Hub, we soon discovered that delivering value with these projects was often expensive: it took a long time before their contribution became visible.

Therefore, we analyzed the entire process to understand the impact data science has and how to maximize the benefits of the AI we apply. We landed on a plan in four stages: access to data, visualization and treatment, selection of the best model, and putting it into production.

Access to data

“If you want a model, you need data”

For this first phase, we need raw data, that is, data exactly as it comes out of the data-generating machine (whether it’s a sensor, a person charging expenses, a drone making videos…), so that we can model the system itself.

The data must be available, and the box where it is stored must be accessible (whether this box is an Excel file, an SQL database, a data lake…), so that we always work with up-to-date data.

We need a few more things, such as a description of the data (what does each variable mean?) and the update frequency, but those requirements are enough for this stage.
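As a minimal sketch of these requirements, the snippet below reads raw rows from two kinds of "box" (CSV text standing in for an Excel export, and an in-memory SQLite database standing in for an SQL database) and checks that every column has a description. The names DataSource, read_csv_rows, read_sql_rows and undocumented_columns, and the sensor data itself, are hypothetical and only for illustration.

```python
import csv
import io
import sqlite3
from dataclasses import dataclass


@dataclass
class DataSource:
    """Hypothetical record of what this phase asks for."""
    name: str
    variables: dict          # variable name -> what it means
    update_frequency: str    # e.g. "hourly", "daily"


def read_csv_rows(text):
    """Read raw rows from CSV text (stand-in for an Excel/CSV export)."""
    return list(csv.DictReader(io.StringIO(text)))


def read_sql_rows(conn, query):
    """Read raw rows from an SQL database as plain dictionaries."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(query)]


def undocumented_columns(rows, source):
    """Return the columns that have no description yet."""
    columns = set(rows[0]) if rows else set()
    return columns - set(source.variables)


# Illustrative data only: two made-up sensor readings.
source = DataSource(
    name="sensor_readings",
    variables={"sensor_id": "identifier of the sensor", "value": "raw measurement"},
    update_frequency="hourly",
)
csv_rows = read_csv_rows("sensor_id,value\ns1,20.5\ns2,21.0\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id TEXT, value REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", [("s1", 20.5), ("s2", 21.0)])
sql_rows = read_sql_rows(conn, "SELECT sensor_id, value FROM readings")

missing = undocumented_columns(csv_rows, source)
```

Note that the CSV reader returns every value as text, while the SQL table keeps its numeric type; differences like this are one reason to look at the data as it really comes out of each box.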

Visualization and Treatment

“If you want a model, you need to understand the data”

This stage recurs in every project of this type, and you will return to it many times.

You always start by understanding the data you have. Some problems can be solved with simple techniques, such as eliminating doubtful observations or selecting the images that really show something. Others require very manual techniques, such as marking exactly where a license plate is in an image, and others are more complicated, such as deciding whether a behavior is normal or not in a set of electrical generation data.
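One of the simple techniques mentioned above, eliminating doubtful observations, could look something like the sketch below. It is only one possible rule, chosen here for illustration: it drops missing values and then discards readings whose modified z-score (based on the median absolute deviation) exceeds a threshold. The function name drop_doubtful, the threshold of 3.5 and the sample readings are assumptions, not part of our actual pipeline.

```python
from statistics import median


def drop_doubtful(values, threshold=3.5):
    """Drop missing values and points far from the median.

    Uses the modified z-score, 0.6745 * |x - median| / MAD, where MAD is
    the median absolute deviation. The threshold is an assumed default.
    """
    present = [v for v in values if v is not None]
    if not present:
        return []
    med = median(present)
    mad = median(abs(v - med) for v in present)
    if mad == 0:
        # All central values are identical; no scale to judge outliers by.
        return present
    return [v for v in present if 0.6745 * abs(v - med) / mad <= threshold]


# Made-up readings: one missing value and one implausible spike.
readings = [20.1, 19.8, None, 20.4, 250.0, 20.0]
clean = drop_doubtful(readings)
```

A rule like this is a starting point, not a verdict: whether 250.0 is a sensor fault or a real event is exactly the kind of question that brings you back to the data.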

But you will always come back.

You will always come back because when the model does not work, you’ll have to see what happens to the data.

You will always come back because when the model is biased, you’ll have to see what happens to the data.

Always, always, always…

Selection of the best model

“If you want a model, you need a model”

Every problem, every goal, will have a model that serves it best.

The key part is to understand the problem because the model’s objective will be defined based on this.

Once you have an objective, you must select the best model or, at least, a model that meets the necessary conditions to alleviate, if not completely solve, the initial problem.

And for that, you test: you divide the data so that you can evaluate each model as if you were actually going to use it.

You define a metric that you want to optimize.

And after testing and testing, you choose the best one, and… is that it?
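The loop just described (divide the data, define a metric, test each candidate, keep the best) can be sketched in a few lines. This is a toy illustration under stated assumptions: the two candidate models (a constant mean predictor and a least-squares line), the mean absolute error metric and the synthetic data are all chosen for the example and are not the models we use in practice.

```python
import random
from statistics import mean


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction that the models never see."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def fit_mean(train):
    """Baseline candidate: always predict the average target."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_line(train):
    """Second candidate: ordinary least-squares line through the data."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def mae(model, test):
    """The metric to optimize: mean absolute error (lower is better)."""
    return mean(abs(model(x) - y) for x, y in test)


def select_best(candidates, train, test):
    """Fit every candidate on train, score it on test, return the winner."""
    scores = {name: mae(fit(train), test) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


# Synthetic data following y = 2x + 1, for illustration only.
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
train, test = train_test_split(data)
best, scores = select_best({"mean": fit_mean, "line": fit_line}, train, test)
```

The important design choice is that the held-out rows play the role of future data: each model is judged only on observations it did not see while being fitted.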

Putting it into production

“If you want a model, you want to use the model”

And it can be used in different ways.

It can be used periodically: the system can get new data, make predictions, and store them somewhere where you can consume them.

It can be used on request: the system can be ready for you to send it data at any time, and it will return the prediction to you.

And it can be used constantly: the system can always be making predictions because you’re sending it data non-stop.
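These three ways of using a model can be sketched as three thin wrappers around the same prediction function. Everything here is hypothetical: the toy model, the function names and the in-memory list used as a store are stand-ins, and real scheduling, APIs and message queues are deliberately left out.

```python
def toy_model(x):
    """Stand-in for a trained model (an assumption for this example)."""
    return 2.0 * x + 1.0


def predict_batch(model, fetch_new_data, store):
    """Periodic use: get new data, predict, and store the results.

    A scheduler (not shown) would call this function at a fixed interval.
    """
    for x in fetch_new_data():
        store.append((x, model(x)))


def predict_on_request(model, x):
    """On-request use: one input in, one prediction out."""
    return model(x)


def predict_stream(model, stream):
    """Constant use: yield a prediction for every item as it arrives."""
    for x in stream:
        yield model(x)


# Periodic: results are stored somewhere to be consumed later.
store = []
predict_batch(toy_model, lambda: [1.0, 2.0], store)

# On request: the caller gets the prediction back immediately.
answer = predict_on_request(toy_model, 3.0)

# Constant: predictions are produced lazily while data keeps coming.
streamed = list(predict_stream(toy_model, iter([0.0, 0.5])))
```

The model is the same in all three cases; what changes is who triggers the prediction and where the result goes.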

Combining the steps to ensure impact

How do we accelerate all these shared steps in the different projects to deliver value quickly?

We turn to Stack MLOps, a set of tools that makes it possible to accelerate the delivery of an MVP (minimum viable product) by 90%:

Providing access to data
Automating model selection
Automating putting it into production

And not only that!

Once the automatic model is delivered, these tools allow it to be iterated on in the background: running experiments, comparing new models developed ad hoc for the particular objective, and switching from one model to another in a simple way.

The way of working? You connect to the data, process it in the way you need, thoroughly define the problem and the metrics to optimize, and… you now have the model ready for production!

And after that?

Then the process of continually improving the models begins. The data scientists work behind the scenes, sharing the metrics they get with new models and new approaches to the problem, and when they improve on the model in production, they swap it in without difficulty.
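The idea of swapping in a better model can be illustrated with a very small registry that tracks the model in production and promotes a challenger only when its metric improves. This is a sketch, not a description of the tools we use: the ModelRegistry class, the model names and the scores are invented, and it assumes an error metric where lower is better.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Hypothetical record of the production model and every experiment."""
    production_name: str
    production_score: float              # error metric: lower is better
    history: list = field(default_factory=list)

    def report(self, name, score):
        """Log a challenger's score and promote it if it beats production."""
        self.history.append((name, score))
        if score < self.production_score:
            self.production_name = name
            self.production_score = score
            return True
        return False


# Invented scores, for illustration only.
registry = ModelRegistry(production_name="automatic-baseline", production_score=0.30)
promoted_first = registry.report("gradient-boosting-v1", 0.35)   # worse: stays out
promoted_second = registry.report("custom-features-v2", 0.22)    # better: promoted
```

Keeping the full history, and not only the winner, is what lets the team share and compare the metrics of every experiment.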

This is the way we work in data science at our Artificial Intelligence Center of Excellence.

Getting value fast, increasing it day by day.


