      Erlang Solutions: How IoT is Revolutionising Supply Chain Management

      news.movim.eu / PlanetJabber · Thursday, 20 July, 2023 - 13:12 · 5 minutes

    As global supply chains continue to face significant disruptions, many businesses are turning to IoT to access greater visibility, reactivity, and streamlined operations.

    Unforeseen geopolitical conflicts, economic pressures due to inflation and severe climate change events have all contributed to an uncertain and costly supply chain environment for companies worldwide in 2023.

    To soften some of these impacts, and to work towards a more intelligent, forward-thinking form of supply chain management, industry leaders continue to turn to the benefits offered by the Internet of Things, or IoT.
    By embracing IoT, your company can transform a scattered supply chain into a fully connected network. In doing so, you’ll be able to access a wide range of benefits like increased visibility and superior inventory management, whilst preparing your company’s foundations for the future of distribution.

    The Future of Distribution: Recent IoT Impacts on the Supply Chain

    Whilst it certainly represents the future of supply chain logistics, IoT is already being adopted across many industries. A recent survey by PwC found that in 2023, 46% of companies had already invested in IoT to the point where it is fully adopted in their supply chain, second only to cloud-based data platforms.

    Adoption to win future investment in supply chains.

    https://www.pwc.com/us/en/services/consulting/business-transformation/digital-supply-chain-survey/supply-chain-tech.html

    When predicting the future of supply chain technology back in 2021, Gartner claimed that 50% of organisations will have invested in solutions that support AI and advanced analytical capabilities, like IoT, by 2024. They also predicted that by 2025, 50% of organisations will have appointed a technology leader who reports directly to their chief supply chain officer (CSCO).

    The existence and normalisation of a CSCO role itself evidences that supply chain management now plays a vital role in c-suite level decision-making for many global businesses. By predicting that CSCOs will soon be naturally reinforced by a senior tech leader in half of all organisations, Gartner has also shown that effective supply chain management today must be intertwined with new technologies like IoT.

    To better understand the accuracy and importance of this prediction, it’s vital to explore the role IoT plays in supply chain management at present.

    The Role of IoT in Supply Chain Management

    IoT can be applied to practically every stage of a supply chain. In fact, due to its communicative nature, it’s advisable to apply IoT across an entire supply chain to embrace the benefits of a fully connected supply network.

    The first, and perhaps most well-known, role of IoT within the supply chain is its capacity to provide real-time location tracking. This is often used to allow customers to track packages en route to their destination, but internally this feature also ensures companies have total visibility over all stages of distribution.

    Increasing visibility means IoT can contribute to more accurate arrival time estimations. This also means businesses can quickly react to any unexpected issues that arise within their supply chain.

    In doing so, IoT can help companies achieve greater risk mitigation, whilst simultaneously providing insights that can support contingency planning. One unique example of a company benefitting from IoT risk mitigation — within both supply chain management as well as customer experience — is Volvo. They now use IoT to track vehicle delivery as well as to provide stolen vehicle tracking for customers.

    Monitoring can also extend to items in storage, which is particularly important for companies shipping perishable goods. In these instances, IoT allows for visibility and control over the environmental conditions of stored packages and equipment. Finally, individual shipments can also be located, speeding up the process of sourcing, identifying and managing goods when held in warehouses or distribution centres.

    Many leading companies have already embraced IoT storage monitoring; for example, Ericsson recently implemented digital asset-tracking solutions in their new 5G smart factories to track critical asset locations.

    The Benefits of Utilising IoT in Supply Chain Management

    The following represent a handful of the key benefits your company could access by investing in IoT across your supply chain.

    • New, Visible Opportunities

    Many of the roles of IoT listed above contribute towards increased visibility over your supply chain. This level of visibility doesn’t just improve resilience and streamline operations; it can also provide insights that reveal entirely new opportunities.

    These could include opening the door for automation, smart packaging that enables customers to interact directly with products, or unearthing potential improvements like route remapping that can further optimise your overall chain.

    • Improved Communication Internally and With Customers

    The data analysis capabilities offered by IoT allow your teams to better communicate with each other, as each team can access detailed information on the current nature of your supply chain.

    This extends to the communication you can offer customers, enhancing their overall customer experience thanks to clear delivery times, the ability to provide alternative arrangements and quick resolutions to problems or disruptions.

    • Meeting Regulations and Sustainability Requirements

    IoT can provide a digital footprint of your supply chain, which is easier to optimise and can ensure you provide accurate reporting to meet ever-changing regulations.

    Being able to optimise and streamline your supply chain can also cut unnecessary emissions, which can help your company work towards more sustainable operations. Gartner’s study found that over half of today’s customers will only do business with companies that practise environmental and social sustainability, and sustainable supply chain management will only grow in importance in the coming years.

    • A Cost-Effective Solution

    Technology adoption, particularly of new or emerging technologies companywide, is often an expensive undertaking.
    However, IoT represents a proven solution and a relatively affordable technology to implement (with future innovations likely to lower costs further), making it the ideal choice for budget-minded decision-makers.

    How to Implement and Scale Supply Chain IoT

    Effective IoT supply chain investment must be scalable, and accessing the above benefits demands that decision-makers solicit support from experts in the space.

    Optimising an entire chain requires a reliable, proven MQTT Messaging Engine like EMQ X.

    By using EMQ X, your business can connect over 50 million different devices, with the potential to handle tens of millions of concurrent clients at any one time. This makes EMQ X massively scalable, which is why it’s already the IoT supply chain management solution of choice for hundreds of leading companies worldwide.

    Our Erlang Solutions IoT specialists have worked closely with EMQ X and have over 20 years of experience building real-time distributed systems. In addition to consulting on projects at any stage, we also offer regular health checks, EMQ X support services and monitoring to ensure your system remains reliable.


    If you’d like to learn more about how to access Erlang Solutions supply chain optimisations through EMQ X, make sure to contact our team today.

    The post How IoT is Revolutionising Supply Chain Management appeared first on Erlang Solutions.

      www.erlang-solutions.com/blog/how-iot-is-revolutionising-supply-chain-management/

      Isode: Icon-PEP 2.0 – New Capabilities

      news.movim.eu / PlanetJabber · Tuesday, 18 July, 2023 - 15:47

    Icon-PEP enables the use of IP applications over HF networks, using the STANAG 5066 Link Layer as an interface.

    Listed below are the changes brought in with 2.0.

    Web Management

    A web interface is provided which includes:

    • Full configuration of Icon-PEP
    • TLS (HTTPS) access and configuration, including bootstrap with a self-signed certificate and identity management
    • Control interface to enable or disable Icon-PEP
    • Monitoring to include:
      • Access to all logging metrics
      • Monitoring GRE traffic with peered routers
      • Monitoring IP Client traffic to STANAG 5066
      • Monitoring DNS traffic
      • Monitoring TCP traffic with details of HTTP queries and responses

    Profiler Enhancement

    OAuth support added to control access to monitoring and configuration.

    NAT Mode

    A NAT (Network Address Translation) mode is introduced, which supports Mobile Unit mobility for traffic initiated by a Mobile Unit. Inbound IP or SLEP (TCP) traffic has its address mapped so that traffic on the shore side appears to come from the local node. This avoids the need for complex IP routing to support traffic to Mobile Units not using fixed IP routing.

    Other Features

    • Product Activation, including control of the number of Units
    • Filtering (previously IP client only) extended to SLEP/TCP
      www.isode.com/company/wordpress/icon-pep-2-0-new-capabilities/

      Isode: Cobalt 1.4 – New Capabilities

      news.movim.eu / PlanetJabber · Tuesday, 18 July, 2023 - 14:58 · 1 minute

    Cobalt provides a web interface for provisioning users and roles in an LDAP directory. It enables the easy deployment of XMPP, Email and Military Messaging systems.

    Listed below are the changes brought in with 1.4.

    HSM Support

    Cobalt is Isode’s tool for managing PKCS#11 Hardware Security Modules (HSMs), which may be used to provide improved server security by protecting PKI private keys.

    • Cobalt provides a generic capability to initialize HSMs and view keys
      • Multiple HSMs can be configured and one set to active
      • Tested with Nitrokey, Yubikey, SoftHSM and Gemalto networked HSM
    • Enables key pair generation and Certificate Signing Request (CSR) interaction with Certificate Authority (CA)
    • Support for S/MIME signing and encryption
      • User identities for email
      • Organization and Role identities for military messaging
    • Server identities that can be used for TLS with Isode servers

    Isode Servers

    A new tab for Isode servers is added that:

    • Enables HSM identities to be provisioned
    • Enables a password to be set, which is needed for Isode servers that bind to the directory to obtain authorization, authentication and other information
    • Facilitates adding Isode servers to a special directory access control group, that enables passwords (usually SCRAM hashed) to be read, to enable SCRAM and other SASL mechanisms to be used by the application

    Profiler Enhancement

    • Extend the SIC rule so that multiple SICs or SIC patterns can be set in a single rule

      www.isode.com/company/wordpress/cobalt-1-4-new-capabilities/

      Erlang Solutions: Re-implement our first blog scrapper with Crawly 0.15.0

      news.movim.eu / PlanetJabber · Tuesday, 25 April, 2023 - 16:02 · 14 minutes

    It has been almost four years since my first article about scraping with Elixir and Crawly was published. Since then, many changes have occurred, the most significant being the redesign of the Erlang Solutions blog. As a result, the 2019 tutorial is no longer functional.

    This situation provided an excellent opportunity to update the original work and re-implement the Crawler using the new version of Crawly. By doing so, the tutorial will showcase several new features added to Crawly over the years and, more importantly, provide a functional version to the community. Hopefully, this updated tutorial will be beneficial to all.

    First of all, why is it broken now?

    This situation is to be expected! When a website gets a new design, usually everything is redone: the new layout results in new HTML, which makes all the old CSS/XPath selectors obsolete, not to mention the new URL schemes. As a result, the XPath/CSS selectors that worked before refer to nothing after the redesign, so we have to start from the very beginning. What a shame!

    But of course, the web is made for more than just crawling. The web is made for people, not robots, so let’s adapt our robots!

    Our experience from a large-scale scraping platform is that a successful business usually runs at least one complete redesign every two years. More minor updates will occur even more often, but remember that even minor updates harm your web scrapers.

    Getting started

    Usually, I recommend starting by following the Quickstart guide from Crawly’s documentation pages. However, this time I have something else in mind. I want to show you the Crawly standalone version.

    Keep it simple. In some cases, the data you need can be extracted from a relatively simple source. In these situations, it might be quite beneficial to avoid bootstrapping all the Elixir stuff (new project, config, libs, dependencies). The idea is to deliver data that other applications can consume without any project setup.

    Of course, the approach has some limitations and only works for simple projects at this stage. Some readers may get inspired by this article and improve it, so that future readers will be amazed by new possibilities. In any case, let’s get straight to it now!

    Bootstrapping 2.0

    As promised, here is the simplified version of the setup (compare it with the previous setup described here):

    1. Create a directory for your project: mkdir erlang_solutions_blog
    2. Create a subdirectory that will contain the code of your spiders: mkdir erlang_solutions_blog/spiders
    3. We want to extract the following fields: title, author, publishing_date, url, article_body. Let’s define the following configuration for your project (erlang_solutions_blog/crawly.config):
    
    [{crawly, [
       {closespider_itemcount, 100},
       {closespider_timeout, 5},
       {concurrent_requests_per_domain, 15},
    
       {middlewares, [
               'Elixir.Crawly.Middlewares.DomainFilter',
               'Elixir.Crawly.Middlewares.UniqueRequest',
               'Elixir.Crawly.Middlewares.RobotsTxt',
               {'Elixir.Crawly.Middlewares.UserAgent', [
                   {user_agents, [
                       <<"Mozilla/5.0 (Macintosh; Intel Mac OS X x.y; rv:42.0) Gecko/20100101 Firefox/42.0">>,
                       <<"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36">>
                       ]
                   }]
               }
           ]
       },
    
       {pipelines, [
               {'Elixir.Crawly.Pipelines.Validate', [{fields, [title, author, publishing_date, url, article_body]}]},
               {'Elixir.Crawly.Pipelines.DuplicatesFilter', [{item_id, title}]},
               {'Elixir.Crawly.Pipelines.JSONEncoder'},
               {'Elixir.Crawly.Pipelines.WriteToFile', [{folder, <<"/tmp">>}, {extension, <<"jl">>}]}
           ]
       }]
    }].
    
    

    You have probably noticed that this looks like an Erlang configuration file, and that is indeed the case. It’s not a perfect solution, and one possible improvement would be to simplify the format so the project is easier to configure. If you have ideas, write to me on GitHub Discussions: https://github.com/elixir-crawly/crawly/discussions.

    4. The basic configuration is now done, so we can start the Crawly application. Run the command from inside the erlang_solutions_blog directory, since it mounts $(pwd)/spiders and $(pwd)/crawly.config:

    docker run --name crawly \
      -d -p 4001:4001 -v $(pwd)/spiders:/app/spiders \
      -v $(pwd)/crawly.config:/app/config/crawly.config \
      oltarasenko/crawly:0.15.0

    Notes:

    • 4001 is the default HTTP port used for spider management, so we need to publish it on the host.
    • The spiders directory is where Crawly expects to find the spider files that will be added to the application later on.
    • Finally, the ugly configuration file is also mounted inside the Crawly container.

    Now you can see the Crawly Management user interface at localhost:4001.

    Crawly Management Tool

    Working on a new spider

    Now, let’s define the spider itself. Let’s start with the following boilerplate code (put it into erlang_solutions_blog/spiders/esl.ex ):

    defmodule ESLSpider do
     use Crawly.Spider
    
     @impl Crawly.Spider
     def init() do
       [start_urls: ["https://www.erlang-solutions.com/"]]
     end
    
     @impl Crawly.Spider
     def base_url(), do: "https://www.erlang-solutions.com"
    
     @impl Crawly.Spider
     def parse_item(response) do
       %{items: [], requests: []}
     end
    end

    This code defines an “ESLSpider” module that uses the “Crawly.Spider” behavior.

    The behavior requires three functions to be implemented:

    init(), base_url(), and parse_item(response).

    The “init()” function returns a keyword list containing a single key-value pair. The key is “start_urls” and the value is a list containing a single URL string: “https://www.erlang-solutions.com/”. This means that the spider will start crawling from this URL.

    The “base_url()” function returns a string representing the base URL for the spider, used to filter out requests that go outside the erlang-solutions.com website.

    The `parse_item(response)` function takes a response object as an argument and returns a map containing two keys: `items` and `requests`.

    Once the code is saved, we can run it via the Web interface (you will need to restart the Docker container or click the Reload spiders button in the Web interface).

    New Crawly Management UI

    Once the job is started, you can review the Scheduled Requests, Logs, or Extracted Items.

    Parsing the page

    Now we need to find CSS selectors to extract the needed data. The same approach is already described at https://www.erlang-solutions.com/blog/web-scraping-with-elixir/ in the section on extracting the data. I think one of the best ways to find relevant CSS selectors is to use Google Chrome’s Inspect option.

    So let’s connect to the Crawly Shell and fetch data using the fetcher, extracting this title:

    docker exec -it crawly /app/bin/crawly remote

    1> response = Crawly.fetch("https://www.erlang-solutions.com/blog/web-scraping-with-elixir/")
    2> document = Floki.parse_document!(response.body)
    3> title_tag = Floki.find(document, ".page-title-sm")
    [{"h1", [{"class", "page-title-sm mb-sm"}], ["Web scraping with Elixir"]}]
    4> title = Floki.text(title_tag)
    "Web scraping with Elixir"
    
    

    We extract all the other fields the same way. In the end, we come up with the following map of selectors representing the expected item:

    item =
     %{
       url: response.request_url,
       title: Floki.find(document, ".page-title-sm") |> Floki.text(),
       article_body: Floki.find(document, ".default-content") |> Floki.text(),
       author: Floki.find(document, ".post-info__author") |> Floki.text(),
       publishing_date: Floki.find(document, ".header-inner .post-info .post-info__item span") |> Floki.text()
      }
    
    requests = Enum.map(
     Floki.find(document, ".link-to-all") |> Floki.attribute("href"),
     fn url -> Crawly.Utils.request_from_url(url) end
    )
    
    

    Putting it all together, we get the following code for the spider:

    defmodule ESLSpider do
     use Crawly.Spider
    
     @impl Crawly.Spider
     def init() do
       [
         start_urls: [
           "https://www.erlang-solutions.com/blog/web-scraping-with-elixir/",
           "https://www.erlang-solutions.com/blog/which-companies-are-using-elixir-and-why-mytopdogstatus/"
         ]
       ]
     end
    
     @impl Crawly.Spider
     def base_url(), do: "https://www.erlang-solutions.com"
    
     @impl Crawly.Spider
     def parse_item(response) do
       {:ok, document} = Floki.parse_document(response.body)
    
       requests = Enum.map(
         Floki.find(document, ".link-to-all") |> Floki.attribute("href"),
         fn url -> Crawly.Utils.request_from_url(url) end
         )
    
       item = %{
         url: response.request_url,
         title: Floki.find(document, ".page-title-sm") |> Floki.text(),
         article_body: Floki.find(document, ".default-content") |> Floki.text(),
         author: Floki.find(document, ".post-info__author") |> Floki.text(),
         publishing_date: Floki.find(document, ".header-inner .post-info .post-info__item span") |> Floki.text()
       }
       %{items: [item], requests: requests}
     end
    end
    
    
    

    That’s all, folks! Thanks for reading!

    Well, not really. Let’s schedule this version of the spider again, and let’s see the results:

    Scraping results

    As you can see, the spider could only extract 34 items. This is quite interesting, as it’s pretty clear that the Erlang Solutions blog contains many more items. So why do we have only this amount? Can anything be done to improve it?

    Debugging your spider

    Some intelligent developers write everything just once, and everything works. Other people like me have to spend time debugging the code.

    In my case, I start with exploring logs. There is something there I don’t like:

    08:23:37.417 [info] Dropping item: %{article_body: "Scalable and Reliable Real-time MQTT Messaging Engine for IoT in the 5G Era.We work with proven, world leading technologies that provide a highly scalable, highly available distributed message broker for all major IoT protocols, as well as M2M and mobile applications.Available virtually everywhere with real-time system monitoring and management ability, it can handle tens of millions of concurrent clients.Today, more than 5,000 enterprise users are trusting EMQ X to connect more than 50 million devices.As well as being trusted experts in EMQ x, we also have 20 years of experience building reliable, fault-tolerant, real-time distributed systems. Our experts are able to guide you through any stage of the project to ensure your system can scale with confidence. Whether you’re hunting for a suspected bug, or doing due diligence to future proof your system, we’re here to help. Our world-leading team will deep dive into your system providing an in-depth report of recommendations. This gives you full visibility on the vulnerabilities of your system and how to improve it. Connected devices play an increasingly vital role in major infrastructure and the daily lives of the end user. To provide our clients with peace of mind, our support agreements ensure an expert is on hand to minimise the length and damage in the event of a disruption. Catching a disruption before it occurs is always cheaper and less time consuming. WombatOAM is specifically designed for the monitoring and maintenance of BEAM-based systems (including EMQ x). This provides you with powerful visibility and custom alerts to stop issues before they occur. As well as being trusted experts in EMQ x, we also have 20 years of experience building reliable, fault-tolerant, real-time distributed systems. Our experts are able to guide you through any stage of the project to ensure your system can scale with confidence. Whether you’re hunting for a suspected bug, or doing due diligence to future proof your system, we’re here to help. Our world-leading team will deep dive into your system providing an in-depth report of recommendations. This gives you full visibility on the vulnerabilities of your system and how to improve it. Connected devices play an increasingly vital role in major infrastructure and the daily lives of the end user. To provide our clients with peace of mind, our support agreements ensure an expert is on hand to minimise the length and damage in the event of a disruption. Catching a disruption before it occurs is always cheaper and less time consuming. WombatOAM is specifically designed for the monitoring and maintenance of BEAM-based systems (including EMQ x). This provides you with powerful visibility and custom alerts to stop issues before they occur. Because it’s written in Erlang!With it’s Erlang/OTP design, EMQ X fuses some of the best qualities of Erlang. A single node broker can sustain one million concurrent connections…but a single EMQ X cluster – which contains multiple nodes – can support tens of millions of concurrent connections. Inside this cluster, routing and broker nodes are deployed independently to increase the routing efficiency. Control channels and data channels are also separated – significantly improving the performance of message forwarding. EMQ X works on a soft real-time basis. No matter how many simultaneous requests are going through the system, the latency is guaranteed.Here’s how EMQ X can help with your IoT messaging needs?Erlang Solutions exists to build transformative solutions for the world’s most ambitious companies, by providing user-focused consultancy, high tech capabilities and diverse communities. Let’s talk about how we can help you.", author: "", publishing_date: "", title: "", url: "https://www.erlang-solutions.com/capabilities/emqx/"}. Reason: missing required fields

    The line above shows that the spider dropped an item scraped from a general page (/capabilities/emqx/), not from an article: its author, publishing_date and title fields are empty. We want to exclude such URLs from the route of our bot.
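
    To see why an item like this is rejected, it can help to picture what a required-fields check does. The sketch below is a simplified illustration, not Crawly's actual implementation: the module name RequiredFieldsSketch and its valid?/2 function are hypothetical, and it assumes a field counts as missing when it is absent, nil, or an empty string.

```elixir
defmodule RequiredFieldsSketch do
  # Hypothetical helper, for illustration only: returns true when every
  # required field is present and is neither nil nor an empty string.
  def valid?(item, required_fields) do
    Enum.all?(required_fields, fn field ->
      Map.get(item, field) not in [nil, ""]
    end)
  end
end

# The same field list that the Validate pipeline is configured with.
required = [:title, :author, :publishing_date, :url, :article_body]

# Shaped like the dropped item from the log: only url and article_body are set.
general_page = %{
  url: "https://www.erlang-solutions.com/capabilities/emqx/",
  title: "",
  author: "",
  publishing_date: "",
  article_body: "Scalable and Reliable Real-time MQTT Messaging Engine for IoT in the 5G Era."
}

# Prints false, because title, author and publishing_date are empty.
IO.inspect(RequiredFieldsSketch.valid?(general_page, required))
```

    The page was still downloaded before being rejected, which is why filtering the requests themselves is the better fix.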

    Try to avoid creating unnecessary loads on a website when doing crawling activities.

    The following lines can achieve this:

    requests =
     Floki.find(document, ".link-to-all") |> Floki.attribute("href")
     |> Enum.filter(fn url -> String.contains?(url, "/blog/") end)
     |> Enum.map(&Crawly.Utils.request_from_url/1)
    

    Now, we can re-run the spider and see that we’re not hitting non-blog pages anymore (don’t forget to reload the spider’s code)!
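
    The filtering step can be tried on its own, without Floki or Crawly. The sketch below uses a small hand-written list of links; the third URL is made up to show a limitation of substring matching. The path-based check is an assumption of mine, not something the spider above uses.

```elixir
# Sample hrefs for illustration; the last one is a made-up URL.
hrefs = [
  "https://www.erlang-solutions.com/blog/web-scraping-with-elixir/",
  "https://www.erlang-solutions.com/capabilities/emqx/",
  "https://www.erlang-solutions.com/contact/?from=/blog/"
]

# Substring check, as used in the spider: keeps any URL that mentions "/blog/".
by_substring = Enum.filter(hrefs, &String.contains?(&1, "/blog/"))

# Stricter variant: only look at the path component of the URL.
blog_path? = fn url ->
  case URI.parse(url).path do
    nil -> false
    path -> String.starts_with?(path, "/blog/")
  end
end

by_path = Enum.filter(hrefs, blog_path?)

IO.inspect(length(by_substring))
IO.inspect(by_path)
```

    For this blog the substring check is enough. The stricter variant only matters if "/blog/" can show up in a query string or deeper in an unrelated path.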

    This optimised our crawler, but more work was needed to extract more items. (Among other things, it’s interesting to note that we can only reach 35 articles through the “Keep reading” section, which points to possible improvements to the cross-linking inside the blog itself.)

    Improving the extraction coverage

    When looking at the possibility of extracting more items, we should try finding a better source of links. One good way to do it is by exploring the blog’s homepage, potentially with JavaScript turned off. Here is what I can see:

    Sometimes you need to switch JavaScript off to see more.

    As you can see, there are 14 pages (only 12 of which are working), and every page contains nine articles. So we expect ~100–108 articles in total.

    So let’s try to use this pagination as a source of new links! I have updated the init() function so that it refers to the blog’s index, and also parse_item so that it can use the information found there:

    @impl Crawly.Spider
     def init() do
       [
         start_urls: [
           "https://www.erlang-solutions.com/blog/page/2/?pg=2",
           "https://www.erlang-solutions.com/blog/web-scraping-with-elixir/",
           "https://www.erlang-solutions.com/blog/which-companies-are-using-elixir-and-why-mytopdogstatus/"
         ]
       ]
     end
    
    @impl Crawly.Spider
    def parse_item(response) do
     {:ok, document} = Floki.parse_document(response.body)
    
     case String.contains?(response.request_url, "/blog/page/") do
       false -> parse_article_page(document, response.request_url)
       true -> parse_index_page(document, response.request_url)
     end
    end
    
    defp parse_index_page(document, _url) do
     index_pages =
       document
       |> Floki.find(".page a")
       |> Floki.attribute("href")
       |> Enum.map(&Crawly.Utils.request_from_url/1)
    
     blog_posts =
       Floki.find(document, ".grid-card__content a.btn-link")
       |> Floki.attribute("href")
       |> Enum.filter(fn url -> String.contains?(url, "/blog/") end)
       |> Enum.map(&Crawly.Utils.request_from_url/1)
    
       %{items: [], requests: index_pages ++ blog_posts }
    end
    
    defp parse_article_page(document, url) do
     requests =
       Floki.find(document, ".link-to-all")
       |> Floki.attribute("href")
       |> Enum.filter(fn url -> String.contains?(url, "/blog/") end)
       |> Enum.map(&Crawly.Utils.request_from_url/1)
    
     item = %{
       url: url,
       title: Floki.find(document, ".page-title-sm") |> Floki.text(),
       article_body: Floki.find(document, ".default-content") |> Floki.text(),
       author: Floki.find(document, ".post-info__author") |> Floki.text(),
       publishing_date: Floki.find(document, ".header-inner .post-info .post-info__item span") |> Floki.text()
     }
     %{items: [item], requests: requests}
    end
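
    The dispatch in parse_item/1 depends only on the request URL, so it can be checked in isolation. In the sketch below, the module PageKindSketch and its kind/1 function are hypothetical names introduced for illustration. They mirror the String.contains?/2 test from the code above.

```elixir
defmodule PageKindSketch do
  # Hypothetical helper: mirrors the `case` in parse_item/1 by checking
  # whether the URL contains "/blog/page/".
  def kind(url) do
    if String.contains?(url, "/blog/page/"), do: :index, else: :article
  end
end

start_urls = [
  "https://www.erlang-solutions.com/blog/page/2/?pg=2",
  "https://www.erlang-solutions.com/blog/web-scraping-with-elixir/"
]

# Print which parser each start URL would be routed to.
Enum.each(start_urls, fn url ->
  IO.puts("#{PageKindSketch.kind(url)}  #{url}")
end)
```

    Note that with this check a URL such as https://www.erlang-solutions.com/blog/ would be treated as an article page, because it does not contain "/blog/page/".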

    Running it again

    Now, finally, after adding all fixes, let’s reload the code and re-run the spider:

    So as you can see, we have extracted 114 items, which looks quite close to what we expected!

    Conclusion

    Honestly speaking, running an open-source project is a complex thing. We have spent almost four years building Crawly and have progressed quite a bit in what it can do, adding some bugs along the way.

    The example above shows how to run a spider with Elixir and Floki, along with the somewhat more involved process of debugging and fixing that is sometimes needed in practice.

    We want to thank Erlang Solutions for supporting the development and allocating help when needed!

    The post Re-implement our first blog scrapper with Crawly 0.15.0 appeared first on Erlang Solutions.

      www.erlang-solutions.com/blog/re-implement-our-first-blog-scrapper-with-crawly-0-15-0-2/

      Isode: Red/Black 2.0 – New Capabilities

      news.movim.eu / PlanetJabber · Friday, 21 April, 2023 - 15:46 · 1 minute

    This major release adds significant new functionality and improvements to Red/Black, a management tool that allows you to monitor and control devices and servers across a network, with a particular focus on HF Radio Systems. A general summary is given in the white paper Red/Black Overview.

    Switch Device

    Support has been added for Switch-type devices, which can connect multiple devices and allow an operator (red or black side) to change switch connections. Physical switch connectivity is configured by an administrator. The switch column can be hidden, so that logical connectivity through the switch is shown.

    SNMP Support

    A device driver for SNMP devices is provided, including SNMPv3 authorization. Abstract device specifications are included in Red/Black for:

    • SNMP System MIB
    • SNMP Host MIB
    • SNMP UPS MIB
    • Leonardo HF 2000 radio
    • IES Antenna Switch
    • eLogic Radio Gateway

    Abstract device specifications can be configured for other devices with suitable SNMP MIBs.

    Further details are provided in the Isode white paper “Managing SNMP Devices in Red/Black”.

    Alert Handling

    The UI shows all devices that have alerts which have not been handled by an operator. It lets an operator see all unhandled alerts for a device and mark some or all of them as handled.

    Device Parameter Display and Management

    A number of improvements have been made to the way device parameters are handled:

    • Improved general parameter display
    • Display in multiple columns, with selectable number of columns and choice of style, to better support devices with large numbers of parameters
    • Parameter grouping
    • Labelled integer support, so that semantics can be added to values
    • Configurable Colours
    • Display of parameter Units
    • Configurable parameter icons
    • Optimized UI for Device refresh; enable/disable; power off; and reset
    • Integer parameters can specify “interval”
    • Parameters with a limited set of integer values can be selected from a drop-down

    Top Screen Display

    The top screen display is improved.

    • Modes of “Device” (monitoring)  and “Connectivity” with UIs optimized for these functions
    • Reduced clutter when no device is being examined
    • Allow columns to be hidden/restored so that the display can be tuned to operator needs
    • Show selected device parameters on top screen so that operator can see critical device parameters without needing to inspect the device details
    • UI clearly shows which links user can modify, according to operator or administrator rights

      www.isode.com/company/wordpress/red-black-2-0-new-capabilities/

      ProcessOne: ejabberd 23.04

      news.movim.eu / PlanetJabber · Wednesday, 19 April, 2023 - 07:48 · 13 minutes

    This new ejabberd 23.04 release includes many improvements and bug fixes, as well as some new features.

    ejabberd 23.04

    A more detailed explanation of these topics and other features:

    Many improvements to SQL databases

    There are many improvements in the area of SQL databases (see #3980 and #3982):

    • Added support for migrating MySQL and MS SQL to the new schema, fixed a long-standing bug, and made many other improvements.
    • Regarding MS SQL, there are schema fixes, added support for the new schema and the corresponding schema migration, along with other minor improvements and bugfixes.
    • The automated ejabberd tests now also run on updated schema databases, and support for running tests on MS SQL has been added.
    • Fixed other minor SQL schema inconsistencies, removed unnecessary indexes and changed PostgreSQL SERIAL columns to BIGSERIAL columns.

    Please upgrade your existing SQL database; see the notes later in this document!

    Added mod_mam support for XEP-0425: Message Moderation

    XEP-0425: Message Moderation allows a Multi-User Chat (XEP-0045) moderator to moderate certain group chat messages, for example by removing them from the group chat history, as part of an effort to address and resolve issues such as message spam, inappropriate venue language, or revealing private personal information of others. It also allows moderators to correct a message on another user’s behalf, or flag a message as inappropriate, without having to retract it.

    Clients that currently support this XEP are Gajim, Converse.js and Monocles; Poezio and XMPP Web have read-only support.

    New mod_muc_rtbl module

    This new module implements Real-Time Block List for MUC rooms. It works by monitoring remote pubsub nodes according to the specification described in xmppbl.org.

    captcha_url option now accepts auto value

    In recent ejabberd releases, captcha_cmd got support for macros (in ejabberd 22.10) and support for using modules (in ejabberd 23.01).

    Now captcha_url gets an improvement: if set to auto, it tries to detect the URL automatically, taking into account the ejabberd configuration. This is now the default. This should be good enough in most cases, but manually setting the URL may be necessary when using port forwarding or very specific setups.

    Erlang/OTP 19.3 is deprecated

    This is the last ejabberd release with support for Erlang/OTP 19.3. If you have not already done so, please upgrade to Erlang/OTP 20.0 or newer before the next ejabberd release. See the ejabberd 22.10 release announcement for more details.

    About the binary packages provided for ejabberd:

    • The binary installers and container images now use Erlang/OTP 25.3 and Elixir 1.14.3.
    • The mix, ecs and ejabberd container images now use Alpine 3.17.
    • The ejabberd container image now supports an alternative build method, useful to work around a problem in QEMU and Erlang 25 when building the image for the arm64 architecture.

    Erlang node name in ecs container image

    The ecs container image is built using the files from docker-ejabberd/ecs and published in docker.io/ejabberd/ecs. This image generally gets only minimal fixes, no major or breaking changes, but in this release it got one change that requires administrator intervention.

    The Erlang node name is now fixed to ejabberd@localhost by default, instead of being variable based on the container hostname. If you previously allowed ejabberd to choose its node name (which was random), it will now create a new mnesia database instead of using the previous one:

    $ docker exec -it ejabberd ls /home/ejabberd/database/
    ejabberd@1ca968a0301a
    ejabberd@localhost
    ...
    

    A simple solution is to create a container that provides ERLANG_NODE_ARG with the old Erlang node name, for example:

    docker run ... -e ERLANG_NODE_ARG=ejabberd@1ca968a0301a
    

    or in docker-compose.yml:

    version: '3.7'
    services:
      main:
        image: ejabberd/ecs
        environment:
          - ERLANG_NODE_ARG=ejabberd@1ca968a0301a
    

    Another solution is to change the mnesia node name in the mnesia spool files.
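
For the ERLANG_NODE_ARG approach, the old node name can be read off the database directory listing shown earlier. The sketch below assumes that listing contains one entry per node name, as in the example output; the helper names `previous_node_names` and `erlang_node_env` are hypothetical and not part of ejabberd or its container images.

```python
def previous_node_names(entries, current="ejabberd@localhost"):
    """Return entries that look like Erlang node names other than the new default."""
    return [name for name in entries if "@" in name and name != current]


def erlang_node_env(entries):
    """Build the ERLANG_NODE_ARG setting, refusing to guess if the choice is ambiguous."""
    candidates = previous_node_names(entries)
    if len(candidates) != 1:
        raise ValueError(f"expected exactly one old node name, found {candidates}")
    return f"ERLANG_NODE_ARG={candidates[0]}"


if __name__ == "__main__":
    # Entries copied from the `ls /home/ejabberd/database/` example.
    listing = ["ejabberd@1ca968a0301a", "ejabberd@localhost"]
    print(erlang_node_env(listing))
```

The resulting string is what would be passed to `docker run -e` or listed under `environment:` in the compose file.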

    Other improvements to the ecs container image

    In addition to the previously mentioned change to the default Erlang node name, the ecs container image has received other improvements:

    • For each commit to the docker-ejabberd repository that affects ecs and mix container images, those images are uploaded as artifacts and are available for download in the corresponding runs.
    • When a new release is tagged in the docker-ejabberd repository, the image is automatically published to ghcr.io/processone/ecs, in addition to being manually published to the Docker Hub.
    • There are new sections in the ecs README file: Clustering and Clustering Example.

    Documentation Improvements

    In addition to the usual improvements and fixes, some sections of the ejabberd documentation have been improved:

    Acknowledgments

    We would like to thank the following people for their contributions to the source code, documentation, and translation for this release:

    And also to all the people who help solve doubts and problems in the ejabberd chatroom and issue tracker.

    Updating SQL Databases

    These notes allow you to apply the SQL database schema improvements in this ejabberd release to your existing SQL database. Please consider which database you are using and whether it is the default or the new schema.

    PostgreSQL new schema:

    Fixes a long-standing bug in the new schema on PostgreSQL. The fix for all existing affected installations is the same:

    ALTER TABLE vcard_search DROP CONSTRAINT vcard_search_pkey;
    ALTER TABLE vcard_search ADD PRIMARY KEY (server_host, lusername);
    

    PostgreSQL default or new schema:

    This converts columns to allow more than 2 billion rows in these tables (a 32-bit SERIAL column tops out at about 2.1 billion values). The conversion requires full table rebuilds and will take a long time if the tables already have many rows. It is optional: it is not necessary if the tables will never grow that large.

    ALTER TABLE archive ALTER COLUMN id TYPE BIGINT;
    ALTER TABLE privacy_list ALTER COLUMN id TYPE BIGINT;
    ALTER TABLE pubsub_node ALTER COLUMN nodeid TYPE BIGINT;
    ALTER TABLE pubsub_state ALTER COLUMN stateid TYPE BIGINT;
    ALTER TABLE spool ALTER COLUMN seq TYPE BIGINT;
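
The row limit behind this conversion comes down to integer width. As a small worked example (the constant and function names are illustrative only), a 4-byte PostgreSQL integer, as used by SERIAL, and an 8-byte BIGINT have these maximum values:

```python
# Largest value of a signed 4-byte integer (PostgreSQL INTEGER / SERIAL).
INT4_MAX = 2**31 - 1
# Largest value of a signed 8-byte integer (PostgreSQL BIGINT / BIGSERIAL).
INT8_MAX = 2**63 - 1


def fits(next_id, max_value):
    """True if an auto-incremented id still fits in the column type."""
    return next_id <= max_value


if __name__ == "__main__":
    print(f"{INT4_MAX:,}")  # 2,147,483,647
    print(f"{INT8_MAX:,}")  # 9,223,372,036,854,775,807
    print(fits(3_000_000_000, INT4_MAX))  # False: would overflow a SERIAL column
    print(fits(3_000_000_000, INT8_MAX))  # True: fine after the BIGINT conversion
```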
    

    PostgreSQL or SQLite default schema:

    DROP INDEX i_rosteru_username;
    DROP INDEX i_sr_user_jid;
    DROP INDEX i_privacy_list_username;
    DROP INDEX i_private_storage_username;
    DROP INDEX i_muc_online_users_us;
    DROP INDEX i_route_domain;
    DROP INDEX i_mix_participant_chan_serv;
    DROP INDEX i_mix_subscription_chan_serv_ud;
    DROP INDEX i_mix_subscription_chan_serv;
    DROP INDEX i_mix_pam_us;
    

    PostgreSQL or SQLite new schema:

    DROP INDEX i_rosteru_sh_username;
    DROP INDEX i_sr_user_sh_jid;
    DROP INDEX i_privacy_list_sh_username;
    DROP INDEX i_private_storage_sh_username;
    DROP INDEX i_muc_online_users_us;
    DROP INDEX i_route_domain;
    DROP INDEX i_mix_participant_chan_serv;
    DROP INDEX i_mix_subscription_chan_serv_ud;
    DROP INDEX i_mix_subscription_chan_serv;
    DROP INDEX i_mix_pam_us;
    

    And now add the index that might be missing:

    In PostgreSQL:

    CREATE INDEX i_push_session_sh_username_timestamp ON push_session USING btree (server_host, username, timestamp);
    

    In SQLite:

    CREATE INDEX i_push_session_sh_username_timestamp ON push_session (server_host, username, timestamp);
    

    MySQL default schema:

    ALTER TABLE rosterusers DROP INDEX i_rosteru_username;
    ALTER TABLE sr_user DROP INDEX i_sr_user_jid;
    ALTER TABLE privacy_list DROP INDEX i_privacy_list_username;
    ALTER TABLE private_storage DROP INDEX i_private_storage_username;
    ALTER TABLE muc_online_users DROP INDEX i_muc_online_users_us;
    ALTER TABLE route DROP INDEX i_route_domain;
    ALTER TABLE mix_participant DROP INDEX i_mix_participant_chan_serv;
    ALTER TABLE mix_subscription DROP INDEX i_mix_subscription_chan_serv_ud;
    ALTER TABLE mix_subscription DROP INDEX i_mix_subscription_chan_serv;
    ALTER TABLE mix_pam DROP INDEX i_mix_pam_u;
    

    MySQL new schema:

    ALTER TABLE rosterusers DROP INDEX i_rosteru_sh_username;
    ALTER TABLE sr_user DROP INDEX i_sr_user_sh_jid;
    ALTER TABLE privacy_list DROP INDEX i_privacy_list_sh_username;
    ALTER TABLE private_storage DROP INDEX i_private_storage_sh_username;
    ALTER TABLE muc_online_users DROP INDEX i_muc_online_users_us;
    ALTER TABLE route DROP INDEX i_route_domain;
    ALTER TABLE mix_participant DROP INDEX i_mix_participant_chan_serv;
    ALTER TABLE mix_subscription DROP INDEX i_mix_subscription_chan_serv_ud;
    ALTER TABLE mix_subscription DROP INDEX i_mix_subscription_chan_serv;
    ALTER TABLE mix_pam DROP INDEX i_mix_pam_us;
    

    Add the index that might be missing:

    CREATE INDEX i_push_session_sh_username_timestamp ON push_session (server_host, username(191), timestamp);
    

    MS SQL

    DROP INDEX [rosterusers_username] ON [rosterusers];
    DROP INDEX [sr_user_jid] ON [sr_user];
    DROP INDEX [privacy_list_username] ON [privacy_list];
    DROP INDEX [private_storage_username] ON [private_storage];
    DROP INDEX [muc_online_users_us] ON [muc_online_users];
    DROP INDEX [route_domain] ON [route];
    go
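
The index removals above use a different statement form per database: PostgreSQL and SQLite drop an index by name alone, MySQL drops it through ALTER TABLE, and MS SQL names the table after ON. A minimal sketch of that difference, using a hypothetical helper `drop_index_sql` that is not part of ejabberd:

```python
def drop_index_sql(dialect, table, index):
    """Return the DROP INDEX statement in the form used in these notes."""
    if dialect in ("postgresql", "sqlite"):
        # The index name is enough; the table is not mentioned.
        return f"DROP INDEX {index};"
    if dialect == "mysql":
        # MySQL drops an index through the table that owns it.
        return f"ALTER TABLE {table} DROP INDEX {index};"
    if dialect == "mssql":
        # MS SQL names the owning table after ON.
        return f"DROP INDEX [{index}] ON [{table}];"
    raise ValueError(f"unknown dialect: {dialect}")


if __name__ == "__main__":
    print(drop_index_sql("postgresql", "route", "i_route_domain"))
    print(drop_index_sql("mysql", "route", "i_route_domain"))
    print(drop_index_sql("mssql", "route", "route_domain"))
```

Because MySQL and MS SQL both name the table, the table in each statement has to be the one the index was created on.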
    

    MS SQL schema was missing some tables added in earlier versions of ejabberd:

    CREATE TABLE [dbo].[mix_channel] (
        [channel] [varchar] (250) NOT NULL,
        [service] [varchar] (250) NOT NULL,
        [username] [varchar] (250) NOT NULL,
        [domain] [varchar] (250) NOT NULL,
        [jid] [varchar] (250) NOT NULL,
        [hidden] [smallint] NOT NULL,
        [hmac_key] [text] NOT NULL,
        [created_at] [datetime] NOT NULL DEFAULT GETDATE()
    ) TEXTIMAGE_ON [PRIMARY];
    
    CREATE UNIQUE CLUSTERED INDEX [mix_channel] ON [mix_channel] (channel, service)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    
    CREATE INDEX [mix_channel_serv] ON [mix_channel] (service)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    
    CREATE TABLE [dbo].[mix_participant] (
        [channel] [varchar] (250) NOT NULL,
        [service] [varchar] (250) NOT NULL,
        [username] [varchar] (250) NOT NULL,
        [domain] [varchar] (250) NOT NULL,
        [jid] [varchar] (250) NOT NULL,
        [id] [text] NOT NULL,
        [nick] [text] NOT NULL,
        [created_at] [datetime] NOT NULL DEFAULT GETDATE()
    ) TEXTIMAGE_ON [PRIMARY];
    
    CREATE UNIQUE INDEX [mix_participant] ON [mix_participant] (channel, service, username, domain)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    
    CREATE INDEX [mix_participant_chan_serv] ON [mix_participant] (channel, service)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    
    CREATE TABLE [dbo].[mix_subscription] (
        [channel] [varchar] (250) NOT NULL,
        [service] [varchar] (250) NOT NULL,
        [username] [varchar] (250) NOT NULL,
        [domain] [varchar] (250) NOT NULL,
        [node] [varchar] (250) NOT NULL,
        [jid] [varchar] (250) NOT NULL
    );
    
    CREATE UNIQUE INDEX [mix_subscription] ON [mix_subscription] (channel, service, username, domain, node)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    
    CREATE INDEX [mix_subscription_chan_serv_ud] ON [mix_subscription] (channel, service, username, domain)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    
    CREATE INDEX [mix_subscription_chan_serv_node] ON [mix_subscription] (channel, service, node)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    
    CREATE INDEX [mix_subscription_chan_serv] ON [mix_subscription] (channel, service)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    
    CREATE TABLE [dbo].[mix_pam] (
        [username] [varchar] (250) NOT NULL,
        [channel] [varchar] (250) NOT NULL,
        [service] [varchar] (250) NOT NULL,
        [id] [text] NOT NULL,
        [created_at] [datetime] NOT NULL DEFAULT GETDATE()
    ) TEXTIMAGE_ON [PRIMARY];
    
    CREATE UNIQUE CLUSTERED INDEX [mix_pam] ON [mix_pam] (username, channel, service)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    
    go
    

    MS SQL also had some incompatible column types:

    ALTER TABLE [dbo].[muc_online_room] ALTER COLUMN [node] VARCHAR (250);
    ALTER TABLE [dbo].[muc_online_room] ALTER COLUMN [pid] VARCHAR (100);
    ALTER TABLE [dbo].[muc_online_users] ALTER COLUMN [node] VARCHAR (250);
    ALTER TABLE [dbo].[pubsub_node_option] ALTER COLUMN [name] VARCHAR (250);
    ALTER TABLE [dbo].[pubsub_node_option] ALTER COLUMN [val] VARCHAR (250);
    ALTER TABLE [dbo].[pubsub_node] ALTER COLUMN [plugin] VARCHAR (32);
    go
    

    … and mqtt_pub table was incorrectly defined in old schema:

    ALTER TABLE [dbo].[mqtt_pub] DROP CONSTRAINT [i_mqtt_topic_server];
    ALTER TABLE [dbo].[mqtt_pub] DROP COLUMN [server_host];
    ALTER TABLE [dbo].[mqtt_pub] ALTER COLUMN [resource] VARCHAR (250);
    ALTER TABLE [dbo].[mqtt_pub] ALTER COLUMN [topic] VARCHAR (250);
    ALTER TABLE [dbo].[mqtt_pub] ALTER COLUMN [username] VARCHAR (250);
    CREATE UNIQUE CLUSTERED INDEX [dbo].[mqtt_topic] ON [mqtt_pub] (topic)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    go
    

    … and sr_group index/PK was inconsistent with other DBs:

    ALTER TABLE [dbo].[sr_group] DROP CONSTRAINT [sr_group_PRIMARY];
    CREATE UNIQUE CLUSTERED INDEX [sr_group_name] ON [sr_group] ([name])
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
    go
    

    ChangeLog

    General

    • New s2s_out_bounce_packet hook
    • Re-allow anonymous connection for connection without client certificates (#3985)
    • Stop ejabberd_system_monitor before stopping node
    • captcha_url option now accepts auto value, and it’s the default
    • mod_mam: Add support for XEP-0425: Message Moderation
    • mod_mam_sql: Fix problem with results of mam queries using rsm with max and before
    • mod_muc_rtbl: New module for Real-Time Block List for MUC rooms (#4017)
    • mod_roster: Set roster name from XEP-0172, or the stored one (#1611)
    • mod_roster: Preliminary support to store extra elements in subscription request (#840)
    • mod_pubsub: Pubsub xdata fields max_item/item_expira/children_max use max not infinity
    • mod_vcard_xupdate: Invalidate vcard_xupdate cache on all nodes when vcard is updated

    Admin

    • ext_mod: Improve support for loading *.so files from ext_mod dependencies
    • Improve output in gen_html_doc_for_commands command
    • Fix ejabberdctl output formatting (#3979)
    • Log HTTP handler exceptions

    MUC

    • New command get_room_history
    • Persist none role for outcasts
    • Try to populate room history from mam when unhibernating
    • Make mod_muc_room:set_opts process persistent flag first
    • Allow passing affiliations and subscribers to create_room_with_opts command
    • Store state in db in mod_muc:create_room()
    • Make subscribers members by default

    SQL schemas

    • Fix a long standing bug in new schema migration
    • update_sql command: Many improvements in new schema migration
    • update_sql command: Add support to migrate MySQL too
    • Change PostgreSQL SERIAL to BIGSERIAL columns
    • Fix minor SQL schema inconsistencies
    • Remove unnecessary indexes
    • New SQL schema migrate fix

    MS SQL

    • MS SQL schema fixes
    • Add new schema for MS SQL
    • Add MS SQL support for new schema migration
    • Minor MS SQL improvements
    • Fix MS SQL error caused by ORDER BY in subquery

    SQL Tests

    • Add support for running tests on MS SQL
    • Add ability to run tests on upgraded DB
    • Un-deprecate ejabberd_config:set_option/2
    • Use python3 to run extauth.py for tests
    • Correct README for creating test docker MS SQL DB
    • Fix TSQLlint warnings in MSSQL test script

    Testing

    • Fix Shellcheck warnings in shell scripts
    • Fix Remark-lint warnings
    • Fix Prospector and Pylint warnings in test extauth.py
    • Stop testing ejabberd with Erlang/OTP 19.3, as Github Actions no longer supports ubuntu-18.04
    • Test only with oldest OTP supported (20.0), newest stable (25.3) and bleeding edge (26.0-rc2)
    • Upload Common Test logs as artifact in case of failure

    ecs container image

    • Update Alpine to 3.17 to get Erlang/OTP 25 and Elixir 1.14
    • Add tini as runtime init
    • Set ERLANG_NODE fixed to ejabberd@localhost
    • Upload images as artifacts to Github Actions
    • Publish tag images automatically to ghcr.io

    ejabberd container image

    • Update Alpine to 3.17 to get Erlang/OTP 25 and Elixir 1.14
    • Add METHOD to build container using packages (#3983)
    • Add tini as runtime init
    • Detect runtime dependencies automatically
    • Remove unused Mix stuff: ejabberd script and static COOKIE
    • Copy captcha scripts to /opt/ejabberd-*/lib like the installers
    • Expose only HOME volume, it contains all the required subdirs
    • ejabberdctl: Don’t use .../releases/COOKIE, it’s no longer included

    Installers

    • make-binaries: Bump versions, e.g. erlang/otp to 25.3
    • make-binaries: Fix building with erlang/otp 25.x
    • make-packages: Fix for installers workflow, which didn’t find lynx

    Full Changelog

    https://github.com/processone/ejabberd/compare/23.01...23.04

    ejabberd 23.04 download & feedback

    As usual, the release is tagged in the git source repository on GitHub.

    The source package and installers are available on the ejabberd Downloads page. To verify the *.asc signature files, see How to verify the integrity of ProcessOne downloads.

    For convenience, there are alternative download locations such as the ejabberd DEB/RPM Packages Repository and the GitHub Release / Tags.

    The ecs container image is available in docker.io/ejabberd/ecs and ghcr.io/processone/ecs. The alternative ejabberd container image is available in ghcr.io/processone/ejabberd.

    If you think you’ve found a bug, please search or file a bug report at GitHub Issues.

    The post ejabberd 23.04 first appeared on ProcessOne.

      Sam Whited: Concord and Spring Road Linear Parks

      news.movim.eu / PlanetJabber · Saturday, 15 April, 2023 - 23:00 · 4 minutes

    In my earlier review of Rose Garden and Jonquil public parks I mentioned the Mountain-to-River Trail (M2R), a mixed-use bicycle and walking trail that connects the two parks.

    The two parks I’m going to review today are also connected by the M2R trail in addition to the Concord Road Trail, but unlike the previous parks these are linear parks that are integrated directly into the trails!

    Since the linear parks aren’t very large and don’t have much in the way of amenities to talk about, we’ll veer outside of our Smyrna focus and discuss a few other highlights of the Concord Road Trail and the southern portion of the M2R trail, starting with the Chattahoochee River.

    Paces Mill

    • Amenities: 🏞️ 👟 🥾 🛶 💩 🚲
    • Transportation: 🚍 🚴 🚣

    The southern terminus of the M2R trail is at the Chattahoochee River National Recreation Area’s Paces Mill Unit. In addition to the paved walking and biking trails, the park has several miles of unpaved hiking trail, fishing, and of course the river itself. Dogs are allowed and bags are available near the entrance. If you head north on the paved Rottenwood Creek Trail you’ll eventually connect to the Palisades West Trails, the Bob Callan Trail, and the Akers Mill East Trail, giving you access to one of the largest connected mixed-use trail systems in the Atlanta area!

    If, instead, you head out of the park to the south on the M2R trail you’ll quickly turn back north into the urban sprawl of the Atlanta suburbs. In approximately 2km you’ll reach the Cumberland Transfer Center where you can catch a bus to most anywhere in Cobb, or transfer to MARTA in Atlanta-proper. At this point the trail also forks for a more direct route to the Silver Comet Trail using the Silver Comet Cumberland Connector trail. We may take that trail another day, but for now we’ll continue north on the M2R trail. Just a bit further north there are also short connector trails to Cobb Galleria Center (an exhibit hall and convention center) and The Battery, a mixed-use development surrounding the Atlanta Braves baseball stadium.

    It’s at this point that the trail turns west along Spring Road where it coincides with the Spring Road Trail that connects to the previously-reviewed Jonquil Park (a total ride of ~3.7km). Shortly thereafter we reach our first actual un-reviewed Smyrna park: the Spring Road Linear Park.

    a map showing bike directions between the CRNRA at Paces Mill and Spring Road Linear Park

    Spring Road Linear Park

    • Amenities: 👟 💩
    • Transportation: 🚍 🚴

    The Spring Road Linear Park stretches 1.1km along the M2R Trail and is easily accessed by both bike (of course) and bus via CobbLinc Route 25.

    The park does not have a sign or other markers, but does have several nice pull-offs with benches that make a good stopover point on your way to or from the buses at the Cumberland Transfer Center. If you’re out walking the dog, public trash cans and dog-poo bags are available on the east end of the park, but do keep in mind that the main trail is mixed-use, so dogs should be kept on one side of the trail to avoid incidents with bikes.

    a trail stretches out ahead with a side trail providing access from a neighborhood

    a small parklet to the side of the trail contains benches and dog poo bags

    After a short climb the trail turns north again and intersects with the Concord Road Trail and the Atlanta Road Trail. We could veer just off the trail near this point to reach Durham Park, the subject of a future review, but instead we’ll continue west, transitioning to the Concord Road Trail to reach our next park: Concord Road Linear Park.

    map of the bike trail between Spring Road Linear Park and Concord Road Linear Park

    Concord Road Linear Park

    • Amenities: 👟 🔧 📚 💩
    • Transportation: 🚍 🚴

    The Concord Road Linear Park sits in the middle of the mid-century Smyrna Heights neighborhood and has something special that’s not often found in poorly designed suburban neighborhoods: (limited) mixed-use zoning! A restaurant and bar (currently seafood) sits at the edge of the park along with a bike repair stand and bike parking.

    It’s worth commending Smyrna for creating this park at all. It may be small, but in addition to the mixed-use zoning it did something else that’s not often seen in the burbs: it removed part of Evelyn Street, disconnecting it from the nearest arterial road! In the war-on-cars this is a small but important victory that creates a quality-of-life improvement for everyone in the neighborhood, whether they bike, walk the dog, or just take a stroll over to the restaurants in the town square without having to be molested by cars.

    a street ends and becomes a walking path into a park

    Formerly part of Evelyn Street, now a path

    Silver Comet Concord Road Trail Head

    • Amenities: 🚻 🍳 👟 📚
    • Transportation: 🚴

    In our next review we’ll turn back and continue up the M2R trail to reach a few other parks, but if we were to continue we’d find that the Concord Road Trail continues for another 4km until it terminates at the Silver Comet Trail’s Concord Road Trail Head. This trail head sits at mile marker 2.6 on the Silver Comet Trail, right by the Concord Covered Bridge Historic District.

    The Silver Comet will likely be covered in future posts, so for now I’ll leave it there. Thanks for bearing with me while we take a detour away from the City of Smyrna’s parks; next time the majority of the post will be about parks within the city, I promise.

    map of the bike trail between Concord Road Linear Park and the Silver Comet Concord Road Trail Head

      blog.samwhited.com/2023/04/concord-and-spring-road-linear-parks/

      Erlang Solutions: Optimising for concurrency: comparing and contrasting the BEAM and JVM virtual machines

      news.movim.eu / PlanetJabber · Wednesday, 12 April, 2023 - 10:11 · 17 minutes

    In this post we will explore the internals of the BEAM virtual machine (VM) and compare it with the Java virtual machine, the JVM.

    The success of any programming language in the Erlang ecosystem can be attributed to three tightly coupled components:

    1. the semantics of the Erlang programming language, on top of which other languages are implemented;
    2. the OTP libraries and middleware used to build scalable architectures and concurrent, resilient systems; and
    3. the BEAM virtual machine, tightly coupled to the language semantics and OTP.

    Take any one of these components on its own and you have a potential winner. But consider the three together and you have an undisputed winner for developing scalable, resilient, soft real-time systems. To quote Joe Armstrong:

    “You can copy the Erlang libraries, but if they don’t run on the BEAM, you can’t emulate the semantics.”

    This idea is reinforced by Robert Virding’s First Rule of Programming, which states that “Any sufficiently complicated concurrent program in another language contains an ad hoc, informally specified, bug-ridden, slow implementation of half of Erlang.”

    In this post we will explore the internals of the BEAM virtual machine. We will compare some of them with the JVM, pointing out why you should pay close attention to them. For too long these components have been treated as a black box, and we have trusted them blindly without understanding what lies behind them. It is time to change that!

    Highlights of the BEAM

    Erlang and the BEAM virtual machine were invented as a tool to solve a specific problem. They were developed by Ericsson to help implement telecommunications infrastructure handling both fixed and mobile networks. This infrastructure is by nature highly concurrent and scalable. It has to work in real time and, ideally, never fail. We don’t want our Hangouts call with our grandmother to drop suddenly because of an error, or our online game of Fortnite to be interrupted because it needs an update. The BEAM virtual machine is optimised to meet many of these challenges, thanks to features that support a predictable concurrent programming model.

    The secret sauce is Erlang processes, which are lightweight, share no memory and are managed by schedulers that can handle millions of them across multiple processors. The BEAM uses a garbage collector that runs on a per-process basis and is highly optimised to reduce the impact on other processes. As a result, the garbage collector does not affect the global real-time properties of the system. The BEAM is also the only widely used virtual machine at scale with a built-in distribution model, which allows a program to run on multiple machines transparently.

    Highlights of the JVM

    The Java Virtual Machine (JVM) was developed by Sun Microsystems in an attempt to provide a platform where you “write once” and run anywhere. They created an object-oriented language, similar to C++ but memory-safe, because its runtime error detection checks array bounds and pointer dereferences. The JVM ecosystem became extremely popular in the Internet era, becoming the de facto standard for enterprise server application development. This wide range of applicability was made possible by a virtual machine that suits many use cases and an impressive set of libraries geared towards enterprise development.

    The JVM was designed with efficiency in mind. Most of its concepts are abstractions of features found in popular operating systems, such as its threading model, which resembles operating system threads. The JVM is highly customisable, including the garbage collector and class loaders. Some state-of-the-art garbage collector implementations provide tunable features that cater for a programming model based on shared memory.

    The JVM allows you to change code while the program is running. And a JIT compiler allows bytecode to be compiled to native machine code, with the aim of speeding up parts of the application.

    Concurrency in the Java world is mostly about running applications in parallel threads, ensuring that they are fast. Programming with concurrency primitives is a difficult task because of the challenges created by its shared-memory model. To overcome these difficulties, there have been attempts to simplify and unify concurrent programming models, of which the Akka framework is the most successful.

    Concurrency and Parallelism

    When we talk about parallel code execution, we mean that parts of the code run at the same time on multiple processors or computers, whereas concurrent programming refers to handling events that arrive independently. Concurrent execution can be simulated on single-threaded hardware, while parallel execution cannot. Although this distinction may seem pedantic, the difference results in problems that need very different approaches. Think of many cooks making a plate of pasta carbonara. In the parallel approach, the tasks are split across the available cooks, and a single portion is completed as quickly as it takes these cooks to finish their specific tasks. In a concurrent world, you would get a portion for every cook, where each cook does all of the tasks.

    Use parallelism for speed and concurrency for scalability.

    Parallel execution tries to find an optimal decomposition of the problem into parts that are independent of each other: boil the water, get the pasta, mix the egg, fry the ham, grate the cheese. The shared data (or in our example, the dish to be served) is handled with locks, mutexes and other techniques that guarantee correctness. Another way to look at this is that the data (or ingredients) are present, and we want to use as many parallel CPU resources as possible to finish the job as quickly as possible.

    Concurrent programming, on the other hand, deals with many events that arrive at the system at different times and tries to process all of them within a reasonable time. On multi-processor or distributed architectures, some of the execution takes place in parallel, but this is not a requirement. Another way to look at it is that the same cook boils the water, gets the pasta, mixes the eggs and so on, following a sequential algorithm that is always the same. What changes across processes (or cookings) is the data (or ingredients) to work on, which exist in multiple instances.
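
    The cooking analogy can be sketched in Java. This is only an illustration under stated assumptions: the class and method names (KitchenModels, parallelSum, cookAll, recipe) are hypothetical, and a fixed pool of four threads stands in for the cooks. The first method splits one job across cores; the second runs many independent orders, each through the same sequential recipe.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical illustration of the two models; not taken from the article.
final class KitchenModels {

    // Parallel: ONE job (summing 1..n) is split across cores to finish sooner.
    static long parallelSum(int n) {
        return IntStream.rangeClosed(1, n).parallel().asLongStream().sum();
    }

    // Concurrent: MANY independent orders, each cooked start to finish
    // by the same sequential recipe. Some may happen to run in parallel.
    static List<String> cookAll(List<String> orders) {
        ExecutorService cooks = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<String>> plates = orders.stream()
                .map(order -> CompletableFuture.supplyAsync(() -> recipe(order), cooks))
                .collect(Collectors.toList());
            // Collect the plates in the order the orders arrived.
            return plates.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            cooks.shutdown();
        }
    }

    // The sequential algorithm that never changes; only the data does.
    private static String recipe(String order) {
        return "carbonara for " + order;
    }
}
```

    Adding cooks to parallelSum makes one result arrive faster; adding cooks to cookAll lets more orders be served at once, which is the speed versus scalability distinction made above.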

    The JVM is built for parallelism, the BEAM for concurrency. They are two intrinsically different problems that require different solutions.

    The BEAM and Concurrency

    The BEAM provides lightweight processes to give context to running code. These processes, also called actors, do not share memory; they communicate through message passing, copying data from one process to another. Message passing is a feature the virtual machine implements through mailboxes owned by individual processes. It is a non-blocking operation, which means that sending a message from one process to another is almost instantaneous and the sender's execution is not blocked. Messages are immutable data, copied from the stack of the sending process to the mailbox of the receiving one. This is achieved without locks or mutexes among the processes; the only lock is on the mailbox, in case multiple processes send a message to the same recipient in parallel.
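
    As a rough JVM-side analogy of a process with a mailbox, here is a minimal sketch. It is not how the BEAM is implemented: a full OS thread stands in for a lightweight process, the names MailboxActor, Deposit and STOP are hypothetical, and records require Java 16 or later. What it does show is the shape of the model: immutable messages, a private state touched by one owner only, and a send that returns immediately.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration: a JVM thread standing in for a BEAM process.
final class MailboxActor {
    // Immutable message: once built, it can be handed over without locks.
    record Deposit(int amount) {}

    // Sentinel message (compared by reference) that ends the receive loop.
    private static final Deposit STOP = new Deposit(-1);

    private final BlockingQueue<Deposit> mailbox = new LinkedBlockingQueue<>();
    private final Thread worker;
    private int balance = 0; // private state, only touched by the worker thread

    MailboxActor() {
        worker = new Thread(() -> {
            try {
                // Receive loop: block until a message arrives, then handle it.
                for (Deposit m = mailbox.take(); m != STOP; m = mailbox.take()) {
                    balance += m.amount();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
    }

    // Non-blocking for the sender: enqueue and return immediately.
    void send(Deposit message) {
        mailbox.offer(message);
    }

    // Stops the actor and returns its final state.
    int stopAndGetBalance() {
        mailbox.offer(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return balance;
    }
}
```

    The only synchronisation in the sketch lives inside the queue, which mirrors the point above that the mailbox is the single place where senders can contend.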

    Immutable data and message passing enable the programmer to write processes that work independently of each other and focus on functionality instead of the low-level management of memory and task scheduling. This simple design works not only on a single thread, but also on multiple threads on a local machine running in the same VM and, using the built-in distribution, across the network with a cluster of VMs and machines. Because messages are immutable between processes, they can be sent to another thread (or machine) without locks, scaling almost linearly on distributed multi-processor architectures. Processes are addressed in the same way on a local VM as in a cluster of VMs; message sending works transparently regardless of the location of the receiving process.

    Processes do not share memory, which allows data to be replicated for resilience and distributed for scale. This means you can have two instances of the same process on two separate machines, sharing a state update between them. If one machine fails, the other has a copy of the data and can continue handling the request, making the system fault tolerant. If both machines are operational, both processes can handle requests, giving you scalability. The BEAM provides highly optimized primitives for all of this to work seamlessly, while OTP (the standard library) provides the higher-level constructs to make programmers' lives easier.

    Akka does a great job of replicating the higher-level constructs, but it is somewhat limited by the JVM's lack of primitives that would allow it to be highly optimized for concurrency. While the JVM's primitives enable a wider range of use cases, they make developing distributed systems harder because they have no default for communication and are often based on a shared memory model. For example, where in a distributed system do you place your shared memory? And what is the cost of accessing it?

    Scheduler

    We mentioned that one of the strongest features of the BEAM is the ability to break a program down into small, lightweight processes. Managing these processes is the task of the scheduler. Unlike the JVM, which maps its threads to OS threads and lets the operating system schedule them, the BEAM comes with its own scheduler.

    By default, the scheduler starts one OS thread for every core of the machine and balances the workload between them. Each process consists of code to be executed and a state that changes over time. The scheduler picks the first process in the run queue that is ready to run and gives it a certain number of reductions to execute, where each reduction is the rough equivalent of a command. Once the process has run out of reductions, is blocked by I/O, is waiting for a message, or has finished executing, the scheduler picks the next process in the run queue and dispatches it. This scheduling technique is called pre-emptive.
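
    The reduction budget can be pictured with a toy model. The sketch below is a deliberate simplification, not the BEAM's scheduler: it runs on one thread, ignores I/O and message waits, and the names ReductionScheduler, Proc and run are hypothetical. Each process gets at most a fixed budget of reductions per turn and goes to the back of the run queue if it still has work left.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy model of reduction-based pre-emption.
final class ReductionScheduler {

    // A toy "process": a name plus how many reductions of work it still needs.
    static final class Proc {
        final String name;
        int remaining;

        Proc(String name, int remaining) {
            this.name = name;
            this.remaining = remaining;
        }
    }

    // Runs every process to completion, pre-empting after `budget` reductions.
    // Returns the order in which time slices were handed out.
    static List<String> run(List<Proc> procs, int budget) {
        if (budget <= 0) {
            throw new IllegalArgumentException("budget must be positive");
        }
        Deque<Proc> runQueue = new ArrayDeque<>(procs);
        List<String> trace = new ArrayList<>();
        while (!runQueue.isEmpty()) {
            Proc p = runQueue.pollFirst();          // first ready process
            int used = Math.min(budget, p.remaining);
            p.remaining -= used;                    // "execute" its slice
            trace.add(p.name);
            if (p.remaining > 0) {
                runQueue.addLast(p);                // out of reductions: requeue
            }
        }
        return trace;
    }
}
```

    With a budget of 2, a process needing 5 reductions and another needing 2 are served in the order A, B, A, A: the long-running process cannot keep the short one waiting for more than one slice.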

    We have mentioned the Akka framework several times; its biggest drawback is the need to annotate code with scheduling points, because scheduling is not done at the JVM level. In the BEAM, by taking that control out of the programmer's hands, real-time properties are preserved and guaranteed, as there is no risk of accidentally causing process starvation.

    Processes can be spread across all the available scheduler threads to maximize CPU utilization. There are many ways to tweak the scheduler, but this is rare and only needed in certain edge cases, as the default options cover most usage patterns.

    A sensitive topic that frequently comes up regarding schedulers is how to handle Natively Implemented Functions (NIFs). A NIF is a piece of code written in C, compiled as a library and run in the same memory space as the BEAM for speed. The problem with NIFs is that they are not pre-emptive and can affect the schedulers. In recent BEAM versions, a new feature, dirty schedulers, was added to give better control over NIFs. Dirty schedulers are separate schedulers that run in different threads to minimize the disruption a NIF can cause in a system. The word dirty refers to the nature of the code these schedulers run.

    Garbage Collector

    Modern programming languages use a garbage collector for memory management. The BEAM languages are no exception. Trusting the virtual machine to handle resources and manage memory is very handy when you want to write high-level concurrent code, as it simplifies the task. The underlying implementation of the garbage collector is fairly straightforward and efficient, thanks to the memory model based on immutable state. Data is copied, not mutated, and the fact that processes do not share memory removes any inter-process dependencies, which therefore do not need to be managed.

    Another feature of the BEAM's garbage collector is that it only runs when needed, on a per-process basis, without affecting other processes waiting in the run queue. As a result, garbage collection in Erlang does not stop the world. It prevents latency spikes in processing, because the VM is never stopped as a whole; only specific processes are, and never all of them at the same time. In practice, it is just part of what a process does and is treated as another reduction. The garbage collector suspends the process for a very short interval, on the order of microseconds. As a result, there will be many small bursts, triggered only when the process needs more memory. A single process usually does not allocate large amounts of memory and is often short-lived, which further reduces the impact by immediately freeing all of its memory when it terminates. One feature of the JVM is the ability to swap garbage collectors, so by using a commercial collector it is also possible to achieve non-stopping garbage collection on the JVM.

    The features of the garbage collector are discussed in this excellent post by Lukas Larsson. There are many intricate details, but it is optimized to handle immutable data efficiently, dividing the data between the stack and the heap for each process. The best approach is to do the majority of the work in short-lived processes.

    A question that often comes up on this topic is how much memory the BEAM uses. If we dig a little, the virtual machine allocates big chunks of memory and uses custom allocators to store data efficiently and minimize the overhead of system calls. This has two visible effects:

    1) Used memory decreases gradually after the space is no longer needed.

    2) Reallocating large amounts of data might mean doubling the current working memory.

    The first effect can, if really necessary, be mitigated by tweaking the allocator strategies. The second is easy to monitor and plan for if you have visibility of the different types of memory usage. (One such monitoring tool that provides system metrics out of the box is WombatOAM.)

    Hot Code Loading

    Hot code loading is probably the most cited unique feature of the BEAM. It means that the application logic can be updated by changing the runnable code in the system while retaining the internal process state. This is achieved by replacing the loaded BEAM files and instructing the VM to replace the references to the code in the running processes.

    It is a fundamental feature for guaranteeing no downtime in a telecom infrastructure, where redundant hardware was used to handle spikes. Nowadays, in the era of containerization, other techniques are also used to update a production system. Those who have never used or needed it dismiss it as a not-so-important feature, but it should not be underestimated in the development workflow. Developers can iterate faster by replacing part of their code without having to restart the system to test it. Even if the application is not designed to be upgradable in production, this can reduce the time needed to recompile and redeploy.

    When Not to Use the BEAM

    It is very much about choosing the right tool for the job.

    Do you need a system that is extremely fast, but are not concerned about concurrency? Do you want to handle a few events in parallel quickly? Do you need to crunch numbers for graphics, AI or analytics? Go down the C++, Python or Java route. Telecom infrastructure does not need fast operations on floats, so speed was never a priority. With dynamic typing, which has to do all type checks at runtime, compiler optimizations are not as trivial. So number crunching is best left to the JVM, Go or other languages that compile to native code. It is no surprise that floating-point operations in Erjang, the version of Erlang running on the JVM, are 5000% faster than on the BEAM. On the other hand, where we have seen the BEAM really shine is in using its concurrency to orchestrate number crunching, outsourcing the analytics to C, Julia, Python or Rust. You do the map outside the BEAM and the reduce within it.

    The mantra has always been “fast enough”. It takes humans a few hundred milliseconds to perceive a stimulus (an event) and process it in their brain, which means micro- or nanosecond response times are not necessary for many applications. Nor is the BEAM recommended for microcontrollers, as it is too resource-hungry. But for embedded systems with a bit more processing power, where multi-core is becoming the norm, you need concurrency, and the BEAM shines there. Back in the 90s, we were implementing telephony switches handling tens of thousands of subscribers running on embedded boards with 16 MB of memory. How much memory does a Raspberry Pi have these days?

    And finally, hard real-time. You probably would not want the BEAM to manage your airbag control system. You need hard guarantees, a real-time operating system and a language with no garbage collection or exceptions. An implementation of an Erlang virtual machine running on bare metal, such as GRiSP, will give you similar guarantees.

    Conclusion

    Use the right tool for the job.

    If you are writing a soft real-time system that has to scale out of the box and never fail, and do so without the hassle of reinventing the wheel, the BEAM is definitely the technology you are looking for. For many, it works as a black box. Not knowing how it works would be like driving a Ferrari and not being able to achieve optimal performance, or not understanding which part of the engine that strange sound is coming from. This is why it is important to learn more about the BEAM, understand its internals and be ready to fine-tune it. For those who have used Erlang and Elixir in anger, we have launched a one-day instructor-led course that will demystify and explain a lot of what you have seen while preparing you to handle massive concurrency at scale. The course is available through our new instructor-led remote training; learn more here. We also recommend the book The BEAM by Erik Stenman and BEAM Wisdoms, a collection of articles by Dmytro Lytovchenko.

    The post Optimización para lograr concurrencia: comparación y contraste de las máquinas virtuales BEAM y JVM appeared first on Erlang Solutions.

      www.erlang-solutions.com/blog/optimizacion-para-lograr-concurrencia-comparacion-y-contraste-de-las-maquinas-virtuales-beam-y-jvm/

    • chevron_right

      JMP: Verify Google Play App Purchase on Your Server

      news.movim.eu / PlanetJabber · Tuesday, 11 April, 2023 - 14:59 · 5 minutes

    We are preparing for the first-ever Google Play Store launch of Cheogram Android as part of JMP coming out of beta later this year.  One of the things we wanted to “just work” for Google Play users is to be able to pay for the app and get their first month of JMP “bundled” into that purchase price, to smooth the common onboarding experience.  So how do the JMP servers know that the app communicating with them is running a version of the app bought from Google Play as opposed to our builds, F-Droid’s builds, or someone’s own builds?  And also ensure that this person hasn’t already got a bundled month before?  The documentation available on how to do this is surprisingly sparse, so let’s do this together.

    Client Side

    Google publishes an official License Verification Library for communicating with Google Play from inside an Android app to determine if this install of the app can be associated with a Google Play purchase.  Most existing documentation focuses on using this library; however, it does not expose anything in the callbacks other than “yes, license verified” or “no, not verified”.  This can allow an app to check if it is a purchased copy itself, but is not so useful for communicating that proof onward to a server.  The library also contains some exciting snippets like:

    // Base64 encoded -
    // com.android.vending.licensing.ILicensingService
    // Consider encoding this in another way in your
    // code to improve security
    Base64.decode(
        "Y29tLmFuZHJvaWQudmVuZGluZy5saWNlbnNpbmcuSUxpY2Vuc2luZ1NlcnZpY2U=")))

    Which implies that they expect developers to fork this code to use it.  Digging into the code, we find in LicenseValidator.java:

    public void verify(PublicKey publicKey, int responseCode, String signedData, String signature)

    Which looks like exactly what we need: the actual signed assertion from Google Play and the signature!  So we just need a small patch to pass those along to the callback as well as the response code currently being passed.  Then we can use the excellent jitpack to include the forked library in our app:

    implementation 'com.github.singpolyma:play-licensing:1c637ea03c'

    Then we write a small class in our app code to actually use it:

    import android.content.Context;
    import com.google.android.vending.licensing.*;
    import java.util.function.BiConsumer;
    
    public class CheogramLicenseChecker implements LicenseCheckerCallback {
        private final LicenseChecker mChecker;
        private final BiConsumer<String, String> mCallback;
    
        public CheogramLicenseChecker(Context context, BiConsumer<String, String> callback) {
            mChecker = new LicenseChecker(
                context,
                new StrictPolicy(), // Want to get a signed item every time
                context.getResources().getString(R.string.licensePublicKey)
            );
            mCallback = callback;
        }
    
        public void checkLicense() {
            mChecker.checkAccess(this);
        }
    
        @Override
        public void dontAllow(int reason) {
            mCallback.accept(null, null);
        }
    
        @Override
        public void applicationError(int errorCode) {
            mCallback.accept(null, null);
        }
    
        @Override
        public void allow(int reason, ResponseData data, String signedData, String signature) {
            mCallback.accept(signedData, signature);
        }
    }

    Here we use the StrictPolicy from the License Verification Library because we want to get freshly signed data every time, and if the device is offline the whole question is moot because we won’t be able to contact the server anyway.

    This code assumes you put the Base64 encoded licensing public key from “Monetisation Setup” in Play Console into a resource R.string.licensePublicKey .

    Then we need to communicate this to the server, which you can do in whatever way makes sense for your protocol; with XMPP we can easily add custom elements to our existing requests, so:

    new com.cheogram.android.CheogramLicenseChecker(context, (signedData, signature) -> {
        if (signedData != null && signature != null) {
            c.addChild("license", "https://ns.cheogram.com/google-play").setContent(signedData);
            c.addChild("licenseSignature", "https://ns.cheogram.com/google-play").setContent(signature);
        }
    
        xmppConnectionService.sendIqPacket(getAccount(), packet, (a, iq) -> {
            session.updateWithResponse(iq);
        });
    }).checkLicense();

    Server Side

    When trying to verify this on the server side we quickly run into some new issues.  What format is this public key in?  It just says “public key” and is Base64 but that’s about it.  What signature algorithm is used for the signed data?  What is the format of the data itself?  Back to the library code!

    private static final String KEY_FACTORY_ALGORITHM = "RSA";
    …
    byte[] decodedKey = Base64.decode(encodedPublicKey);
    …
    new X509EncodedKeySpec(decodedKey)

    So we can see it is an X.509-related encoding, and indeed it turns out to be Base64-encoded DER.  So we can run this:

    echo "BASE64_STRING" | base64 -d | openssl rsa -pubin -inform der -in - -text

    to get the raw properties we might need for any library (key size, modulus, and exponent).  Of course, if your library supports parsing DER directly you can also use that.
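
    For a server written in Java, the same parsing can be done with the standard library alone. The sketch below is a minimal example under assumptions: the class name LicenseKeyParser and its parse method are hypothetical, and the input is assumed to be the Base64 string copied from Play Console.

```java
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.interfaces.RSAPublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;

// Hypothetical helper; mirrors the key handling seen in the library code above.
final class LicenseKeyParser {

    // Decodes a Base64 string holding a DER-encoded (X.509 SubjectPublicKeyInfo)
    // RSA public key.
    static RSAPublicKey parse(String base64Der) {
        try {
            byte[] der = Base64.getDecoder().decode(base64Der);
            KeyFactory factory = KeyFactory.getInstance("RSA");
            return (RSAPublicKey) factory.generatePublic(new X509EncodedKeySpec(der));
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("not a DER-encoded RSA public key", e);
        }
    }
}
```

    The returned key exposes getModulus() and getPublicExponent(), which are the same raw properties the openssl command prints.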

    import java.security.Signature;
    …
    private static final String SIGNATURE_ALGORITHM = "SHA1withRSA";
    …
    Signature sig = Signature.getInstance(SIGNATURE_ALGORITHM);
    sig.initVerify(publicKey);
    sig.update(signedData.getBytes());

    Combined with the Java documentation we can thus say that the signature algorithm is PKCS#1 padded RSA with SHA1.
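
    Putting those pieces together, a server-side check in Java might look like the following sketch. The class name LicenseSignatureCheck and the method isValid are hypothetical. Two assumptions are made explicit: the signature arrives Base64-encoded, and the signed data is treated as UTF-8 bytes (the library excerpt calls getBytes() with the platform default charset).

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

// Hypothetical helper; a sketch of the verification step, not the library's code.
final class LicenseSignatureCheck {

    // Returns true only if base64Signature is a valid SHA1withRSA signature
    // of signedData under the given public key.
    static boolean isValid(PublicKey key, String signedData, String base64Signature) {
        try {
            Signature sig = Signature.getInstance("SHA1withRSA");
            sig.initVerify(key);
            // Assumption: the signed bytes are the UTF-8 encoding of the string.
            sig.update(signedData.getBytes(StandardCharsets.UTF_8));
            return sig.verify(Base64.getDecoder().decode(base64Signature));
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            // Malformed Base64, wrong key type or malformed signature: reject.
            return false;
        }
    }
}
```

    Treating every failure as "not verified" keeps the caller simple: anything other than true means no bundled month.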

    And finally:

    String[] fields = TextUtils.split(mainData, Pattern.quote("|"));
    data.responseCode = Integer.parseInt(fields[0]);
    data.nonce = Integer.parseInt(fields[1]);
    data.packageName = fields[2];
    data.versionCode = fields[3];
    // Application-specific user identifier.
    data.userId = fields[4];
    data.timestamp = Long.parseLong(fields[5]);

    So the data is pipe-separated text. The main field of interest for us is userId, which is (as it says in a comment) “a user identifier unique to the <application, user> pair”. So in our server code:
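
    For readers who want the field layout spelled out before the server code, here is a hedged Java sketch of the same parsing. The record LicenseData and its parse method are hypothetical names, and records require Java 16 or later. One assumption goes beyond the excerpt above: any optional extras after the first ':' are dropped before splitting, on the understanding that the library separates main data from extras that way. Only parse data whose signature has already been verified.

```java
// Hypothetical holder for the six pipe-separated fields of the signed data.
record LicenseData(int responseCode, String nonce, String packageName,
                   String versionCode, String userId, long timestamp) {

    static LicenseData parse(String signedData) {
        // Assumption: optional extras follow the first ':' and are not needed here.
        int colon = signedData.indexOf(':');
        String main = colon == -1 ? signedData : signedData.substring(0, colon);

        String[] f = main.split("\\|");
        if (f.length < 6) {
            throw new IllegalArgumentException(
                "expected 6 pipe-separated fields, got " + f.length);
        }
        // The nonce is kept as text here; the server only needs userId.
        return new LicenseData(Integer.parseInt(f[0]), f[1], f[2], f[3], f[4],
                               Long.parseLong(f[5]));
    }
}
```

    The userId() accessor then gives the per-application user identifier that is used to decide whether a bundled month was already granted.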

    import Control.Error (atZ)
    import Data.Maybe (fromMaybe)
    import qualified Data.ByteString.Base64 as Base64
    import qualified Data.Text as T
    import Data.Text.Encoding (encodeUtf8)
    import Crypto.Hash.Algorithms (SHA1(SHA1))
    import qualified Crypto.PubKey.RSA as RSA
    import qualified Crypto.PubKey.RSA.PKCS15 as RSA
    import qualified Data.XML.Types as XML
    
    googlePlayUserId
        | googlePlayVerified = (T.split (=='|') googlePlayLicense) `atZ` 4
        | otherwise = Nothing
    googlePlayVerified = fromMaybe False $ fmap (\pubKey ->
        RSA.verify (Just SHA1) pubKey (encodeUtf8 googlePlayLicense)
            (Base64.decodeLenient $ encodeUtf8 googlePlaySig)
        ) googlePlayPublicKey
    googlePlayLicense = mconcat $ XML.elementText
        =<< XML.isNamed (s"{https://ns.cheogram.com/google-play}license")
        =<< XML.elementChildren payload
    googlePlaySig = mconcat $ XML.elementText
        =<< XML.isNamed (s"{https://ns.cheogram.com/google-play}licenseSignature")
        =<< XML.elementChildren payload

    We can then use the verified and extracted googlePlayUserId value to check if this user has got a bundled month before and, if not, to provide them with one during signup.

      blog.jmp.chat/b/play-purchase-verification-2023